ABSTRACT OF THE DISSERTATION


Error-Coded  Algorithms  for  On-Line  Arithmetic 

by 

Abdolali  Gorji-Sinaki 
Doctor  of  Philosophy  in  Computer  Science 
University  of  California,  Los  Angeles,  1981 
Professor  Milos  D.  Ercegovac,  Chair 

Since on-line arithmetic requires relatively long sequences of
operations in order to achieve speed-up over conventional
arithmetic, it is important to protect on-line algorithms against
hardware failures. If not protected, hardware failures could
quickly contaminate a large number of results in progress because
of the tight coupling of the steps at the digit level. By
detecting errors as they occur, an effective, gracefully
degradable organization could be achieved. Namely, an error at any
step of the algorithms would lead to a restriction of the
precision (significance) of the remaining steps but not to
catastrophic termination.

The main objective of this dissertation is to develop and
demonstrate the feasibility of error-coded on-line arithmetic
suitable for distributed systems.

In  this  thesis  a  set  of  error-coded  on-line  algorithms 
was  developed  for  the  four  basic  operations  of 
addition/subtraction,  multiplication  and  division.  Low  cost 
arithmetic  error  codes  (Residue  and  AN  Codes)  were  found  to 
be  suitable  for  this  purpose. 

Hardware design of the error-coded units at the gate level was
considered. A residue-coded on-line division unit was designed
based on an already designed digit-slice division unit.

A general mathematical model for the cost and speed of the
error-coded units was derived and compared with the corresponding
values when no error code is used. Finally, the effectiveness of
the proposed detection/correction schemes was considered and
proved.
CHAPTER 1


INTRODUCTION 

1.1 Motivations and Objectives

This thesis is concerned with the development of a set of
error-coded basic algorithms for on-line arithmetic. In on-line
processing the operands, as well as the results, flow through the
arithmetic unit in a digit-by-digit manner starting with the most
significant digit. On-line arithmetic provides a simple approach
to achieve higher computational rates by allowing overlap at the
digit level between the successive operations [ERC 75, TRI 77,
IRW 77]. In particular, on-line arithmetic is highly attractive in
some special applications, such as serial real-time processing,
variable precision arithmetic and data flow architecture. Because
of the serial nature of the algorithms, they might be used
effectively in conjunction with large serial memories (CCDs,
Bubble, etc.). On-line arithmetic offers a number of tradeoffs in
system organization (interconnection and memory structures) that
warrant additional research in this area.


Since on-line arithmetic requires relatively long sequences of
operations in order to achieve speed-up over conventional
arithmetic, it is important to protect on-line algorithms against
hardware failures. If not protected, hardware failures could
quickly contaminate a large number of results in progress because
of the tight coupling of steps at the digit level. By detecting
errors as they occur, an effective, gracefully degradable
organization could be achieved. Namely, an error at the j-th step
would lead to a restriction of the precision (significance) of the
remaining steps but not to catastrophic termination.

In  this  thesis  we  address  the  problem  of  developing 
such  detection  and  correction  procedures.  We  shall  show  that 
low-cost  arithmetic  error  codes  can  be  used  effectively  to 
support  error-coded  on-line  arithmetic.  Low  cost  error 
codes  are  advantageous  because  of  the  very  simple  checking 
procedure  and  cost-effective  implementation. 

In the rest of the current chapter we review the state of the art
in on-line algorithms and consider some of their properties and
applications. Chapter 2 gives a summary of the existing error
codes. Chapters 3 and 4 contain the main results of this work and
deal with the presentation of the detection/correction schemes and
their hardware implementation. In Chapter 5 the performance of the
error-coded units is considered, and their cost and speed are
compared with those of the corresponding ordinary on-line units.
Chapter 6 contains a summary of the results obtained and some
suggestions for future research in the area of on-line arithmetic.


1.2 On-Line Arithmetic


1.2.1 Definitions

By on-line algorithms we mean those arithmetic algorithms in which
the operands as well as the results flow through the arithmetic
unit in a digit-by-digit fashion, most significant digits first.
These algorithms are such that, in order to generate the j-th
digit of the result, (j+δ) digits of the corresponding operands
are required. δ is called the on-line delay and is preferred to be
as small as possible (Figure 1.1).

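[Diagram: the operand digits x_{j+δ} and y_{j+δ} enter the on-line
unit, which produces the result digit z_j.]

t      :  1    2    ...  δ    δ+1      ...  n    n+1  ...  n+δ
INPUT  :  x_1  x_2  ...  x_δ  x_{δ+1}  ...  x_n  0    ...  0
          y_1  y_2  ...  y_δ  y_{δ+1}  ...  y_n  0    ...  0
OUTPUT :  -    -    ...  -    z_1      ...  ...

Figure (1.1) An On-Line Arithmetic Unit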


It is not difficult to see that the use of a redundant number
representation is mandatory for on-line algorithms. If we were to
use a non-redundant number system, then even simple operations
like addition and subtraction would have an on-line delay of δ=m
due to carry propagation (m is the length of the operands). By
using the signed-digit representation of numbers [AVI 61], it is
possible to limit the carry propagation to one digit position.
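
The effect of redundancy on carry propagation can be illustrated
with a short sketch. It assumes radix 10 with the digit set
{-6, ..., 6}, which is one admissible choice for signed-digit
addition; the function names and the example operands are
illustrative and are not taken from [AVI 61]. Digits are consumed
most significant first, and each result digit is emitted as soon
as the operand digits of the next position are known, that is,
with an on-line delay of one digit.

```python
def sd_value(digits, radix=10):
    """Integer value of a signed-digit string given most significant digit first."""
    value = 0
    for d in digits:
        value = value * radix + d
    return value


def online_sd_add(x_digits, y_digits, radix=10, max_digit=6):
    """Yield the digits of x + y, most significant first.

    Assumes (radix + 1) / 2 <= max_digit <= radix - 1, so that the
    transfer digit never propagates further than one position.
    The result has one digit more than the operands.
    """
    pending = 0  # interim sum still waiting for the transfer from the next position
    for x, y in zip(x_digits, y_digits):
        s = x + y
        if s >= max_digit:
            transfer = 1
        elif s <= -max_digit:
            transfer = -1
        else:
            transfer = 0
        # The previous position can be finished now: its digit depends
        # only on its own interim sum and on this single transfer digit.
        yield pending + transfer
        pending = s - radix * transfer
    yield pending


x = [3, 6, -5]   # 3*100 + 6*10 - 5 = 355
y = [4, 5, -6]   # 4*100 + 5*10 - 6 = 444
z = list(online_sd_add(x, y))
print(z, sd_value(z))   # [1, -2, 0, -1] 799
```

Every output digit is final when it is emitted, although the less
significant operand digits have not yet been seen. With a
non-redundant digit set this would not be possible, because a
carry from the least significant position could still change it.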

1.2.2 Background

In general, an on-line algorithm is specified recursively in terms
of the on-line representation of operands, results and some
internal values. The following are the steps of a typical on-line
algorithm:

1. Initialization:

2. Basic Recursion Step:

P_j = f(P_{j-1}, x_{j+δ}, y_{j+δ}, z_j)

where f is a linear function and P_j is the partial result.

3. Selection Step:

z_{j+1} = SELECT(P_j, x_{j+δ}, y_{j+δ})
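
The three steps can be made concrete with a small sketch of
on-line multiplication. It is an illustration under stated
assumptions and not the algorithm of [TRI 77]: the operands are
radix-2 signed-digit fractions with digits in {-1, 0, 1}, the
partial result is kept exactly with rational arithmetic, the
selection simply rounds the partial result to the nearest digit,
and, as a simplification, a leading digit of weight 2^0 is emitted
so that the result has the form p_0.p_1...p_n. The names
online_multiply, select and sd_fraction are hypothetical.

```python
from fractions import Fraction


def sd_fraction(digits, first_exponent=1):
    """Value of signed digits with weights 2**-first_exponent, 2**-(first_exponent+1), ..."""
    return sum(Fraction(d, 2 ** (first_exponent + i)) for i, d in enumerate(digits))


def select(v):
    """Selection step: round the partial result to the nearest digit in {-1, 0, 1}."""
    if v >= Fraction(1, 2):
        return 1
    if v <= Fraction(-1, 2):
        return -1
    return 0


def online_multiply(x_digits, y_digits, delta=3):
    """Return [p_0, p_1, ..., p_n], the digits of x*y, most significant first."""
    n = len(x_digits)
    if len(y_digits) != n or delta < 2:
        raise ValueError("operands must have equal length and delta must be >= 2")
    x_prefix = Fraction(0)   # value of the x digits received so far
    y_prefix = Fraction(0)   # value of the y digits received so far
    w = Fraction(0)          # 1. initialization of the partial result
    result = []
    for j in range(-delta, n):
        k = j + 1 + delta    # index of the operand digits arriving in this step
        x_in = x_digits[k - 1] if k <= n else 0   # zeros follow the last digit
        y_in = y_digits[k - 1] if k <= n else 0
        y_next = y_prefix + Fraction(y_in, 2 ** k)
        # 2. basic recursion step: linear in the old partial result and new digits
        v = 2 * w + (x_prefix * y_in + y_next * x_in) / 2 ** delta
        # 3. selection step, followed by subtracting the selected digit
        p = select(v)
        w = v - p
        x_prefix += Fraction(x_in, 2 ** k)
        y_prefix = y_next
        if j + 1 >= 0:       # the first delta steps only absorb operand digits
            result.append(p)
    return result


x = [1, 1, 0, -1]
y = [1, 0, 1, 1]
p = online_multiply(x, y)
error = sd_fraction(x) * sd_fraction(y) - sd_fraction(p, first_exponent=0)
print(p, error)
```

In this sketch the scaled difference between the exact product of
the digits received so far and the result digits emitted so far
plays the role of P_j. Rounding keeps it at most 1/2 in magnitude,
so the n+1 result digits differ from the exact product by at most
2^-(n+1).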

Several of the well known basic algorithms satisfy the on-line
property with respect to either the operands or the results.
Consider, for example, conventional division, which has the
on-line property with respect to the quotient digits. Similarly,
conventional multiplication has the on-line property with respect
to the multiplier. This property has later been extended to the
product digits as well.
As was mentioned earlier in this section, allowing redundancy in
number representation will speed up the operation by limiting the
carry propagation. A well known example is the totally parallel
addition/subtraction with δ=1 [AVI 61]. More recent work in the
area of on-line computation has been done by Ercegovac and Trivedi
[ERC 75, TRI 77, TRI 78]. They developed on-line algorithms for
multiplication and division. An overview of the generalized (with
respect to radix) on-line algorithms for addition/subtraction,
multiplication and division has appeared in [IRW 77] along with
the design of an on-line arithmetic unit. Others have extended
on-line algorithms to encompass on-line square rooting [ERC 78,
OKL 78] and on-line normalization [GRN 79]. Also, a systematic
method for the derivation of on-line addition/subtraction,
multiplication and division algorithms appears in [GOR 80].
Several on-line algorithms, such as y = ax + b, have been
developed and used in iterative structures for array computations.
Typical problems, such as matrix-vector multiplication and solving
linear recurrence systems, have been investigated, and
corresponding solutions using on-line approaches have been
proposed and evaluated [ERC 80, GRN 80]. Other on-line algorithms
and structures are reported in [CHU 80]. In order to efficiently
explore and develop on-line algorithms, a highly functional
simulator has been developed; it is running on a DEC VAX 11/780
[RAG 80].
1.2.3 Performance of On-Line Algorithms

There are seven major criteria that should be considered in
evaluating the performance of a computational algorithm. These
seven criteria are listed below:

1. Speed: throughput and delay

2. Cost: processor, processor types and storage requirements

3. Control efficiency

4. Interconnection requirements

5. Flexibility

6. Modularity

7. Reliability/"Robustness" of the algorithms.

The potential of on-line arithmetic for achieving high performance
has long been recognized, and because of this a number of basic
on-line algorithms have been developed in the literature (for the
corresponding references see 1.2.2). Also, a paper by Ercegovac
and Grnarov [ERC 80] analyzes the performance of on-line
arithmetic structures. It provides a relative comparison with
conventional arithmetic in computational problems such as the
evaluation of scalar and vector expressions and recurrence
systems. In what follows we analyze the performance of on-line
algorithms with respect to the seven criteria mentioned above.


Speed

The speed-up of on-line algorithms is achieved by consistently
applying a digit-serial mode of operation where the operands and
the results are processed beginning with the most significant
digit. Therefore, successive operations can be overlapped at the
digit level and the interconnection requirements between
arithmetic units are reduced to a minimum. Also, by using a
redundant representation of the partial results, it is possible to
limit the carry propagation. Consequently, the time required to
compute one output digit can be made independent of the length of
the operands.

Using a higher radix may also increase the speed of the
computation by reducing the number of steps necessary for a given
precision. At the same time, however, it increases the time to
perform the basic recursion and the complexity of the
corresponding on-line unit. Ercegovac and Grnarov [ERC 80]
compared the speed of a multilevel on-line unit with the
corresponding conventional unit, demonstrating that for m=32 (m is
the number of digits of the result) a network with two or more
levels is faster in on-line arithmetic than in conventional
arithmetic. They also showed that the time required to perform an
operation is linearly proportional to the required precision. The
results of their study indicate that by using on-line arithmetic
(besides highly reduced communication requirements and modular,
uniform implementation) one can expect an additional speed-up
factor of 2-10.

Pipelining of successive operations can also be used as an
effective speed-up technique [AVI 70, TUN 70, ERC 80]. In this
scheme multiple on-line units are connected together in such a way
that when the first unit completes processing, it passes all the
necessary information down the pipe to the next unit. When one
unit has completed all of the processing associated with the
present operation, the next unit in line can begin generating the
next result digit associated with that same instruction. In this
way, the fraction arithmetic unit, which has been traditionally
considered as a single stage of the pipeline, can be further
decomposed into multiple stages to speed up processing even more.
Chaining operations on result digits as they become available can
increase processing speed even more.

Cost 

The cost of on-line networks is a function of the cost of the
on-line arithmetic units and the cost of communication between the
corresponding modules. Since in an on-line environment the
interconnection between modules is via a one-digit wide link, the
communication cost is obviously less than that of a conventional
network. In a conventional network the number of data links
between two modules is proportional to the number of digits
transferred, which is usually a full precision number. On the
other hand, the number of modules required to implement a
conventional arithmetic unit is at least proportional to m, while
the corresponding number in an on-line environment is proportional
to m/2 [ERC 80]. This factor also reduces the cost of on-line
networks with respect to conventional ones. Ercegovac and Grnarov
[ERC 80] proved that a sufficient condition for an on-line,
non-pipelined network to be less costly than the conventional one
is that the cost of an on-line module should not be more than
twice the cost of the conventional module.

Control 

Typically, the most random part of any system is its control
logic. This randomness in logic makes the design of the control
part of the system cumbersome and expensive. In order to alleviate
this problem it is possible to microprogram the on-line unit,
possibly via a PLA, to avoid randomness in the control logic. On
the other hand, since the basic computational step in an on-line
algorithm is invariant at every step j and the only primitive
arithmetic operation is addition, the control section can be
designed in a straightforward manner. Ercegovac showed that the
control requirements of an on-line unit are very simple. Assuming
a synchronous mode of operation of the entire configuration, he
showed that the synchronizing clock pulses on which the transfer
of digits occurs are all that is needed, and that the same clock
pulses, defining the basic step, are distributed to all units
[ERC 75]. Finally, it is worth noting that, even though the
on-line algorithms are iterative in nature, there are no
convergence tests to be performed, and this makes the control part
simple and deterministic.

Interconnection Requirements

As was mentioned earlier in this section, one of the advantages of
on-line units, in addition to a simple computing block, is the
simplicity of communication between the corresponding modules.
This reduction in internal and external communication requirements
comes from the fact that each module's control sees only its own
state; therefore the interconnection among the elementary on-line
units requires only single digit links. Because of this, a
structure using on-line arithmetic can be implemented in a highly
modular manner. Pipelining of the on-line modules will also
increase the complexity of the units, while the communication
required between units will increase the number of links and
therefore the pin count of each unit.

Flexibility 

The ability of the on-line methods to perform without severe
degradation while using limited resources (in other words, their
implementation flexibility) is also of practical importance. The
on-line structures are easily extendible to accommodate either
more levels or higher precision. Ercegovac proved that the
proposed on-line method can be implemented under a wide range of
speed/cost constraints in a simple way [ERC 75]. His method
requires, for the fastest evaluation, a configuration of m
identical elementary units, but allows, in a straightforward
manner, exploitation of its flexibility in the tradeoff between
speed and cost. The cost change in precision, in the number of
elementary units or in their complexity, affects the speed of
computation linearly.

Modularity 

It was previously mentioned that the interconnections in an
on-line arithmetic network are much simpler than in a conventional
one, since only single digits are transferred between the
operational units. Therefore, the structures using on-line
arithmetic can be implemented in a highly modular manner. This
property makes the arithmetic unit expandable both from the
individual chip and from the overall system viewpoint. In order to
achieve this, the processing logic of on-line units should be
partitioned to make it suitable for LSI. Logic partitioning
involves the organization of the internal logic structures so that
large functional areas (or arrays) on the chip can be grouped
together and used repetitively. External to the chip, functional
partitioning of the overall system requires a framework consisting
of modules which are completely self-contained processors, each
having its own local store, processing logic, and the control
necessary for the module to execute its function. Thus, each
module acts as a small insular unit of logic. A good example of
such a building block, for signed-digit arithmetic, is the
single-package arithmetic processor called the Arithmetic Building
Element (ABE) [AVI 70]. In the on-line environment, a typical
module, implemented in an LSI technology, can be a 16 bit unit
with a 4-operand adder, 4 registers, and a selection and carry
block which can be by-passed, so that a larger precision unit can
be simply constructed by concatenating the required number of
basic modules [ERC 75]. An organization of an on-line unit as a
linear array of identical modules operating in parallel is shown
in Figure 1.2.


Figure  (1.2)  A  Modular  Organization  of  an  On-Line  Unit 
Reliability


The reliability of on-line algorithms is the major concern of this
thesis. We are trying to enhance their "robustness" by applying
error detection and correction to the already developed
algorithms. The main result of this work is presented in Chapter
3, where error-coded algorithms for the four basic operations of
addition, subtraction, multiplication and division are defined.
Low-cost arithmetic error codes (Residue and AN codes) are found
to be well suited for this purpose because the checking procedure
is very simple and cost-effective to implement.


The most obvious use of on-line arithmetic is in the area of
real-time processing in which the operands are generated serially
by an analog-to-digital conversion process beginning with the most
significant digits. An on-line unit can be used to process these
digits as soon as they become available. This is unlike the
conventional setup, where the processing unit must wait while the
full precision operands are converted before starting the
operation. The speed-up benefits are obvious. In fact, any system
designed to be of use in a real-time environment can make
significant gains with the addition of an on-line module to its
hardware.


Another possible application is in performing variable precision
arithmetic. The existing algorithms and their simple
implementation requirements are compatible with the required
modularity of any variable precision unit. It is believed that
sufficient register and adder widths can be provided by large
scale integrated technology to provide enough "variable precision
arithmetic" to meet the demands of most applications [AVI 62]. As
a result, a unit which operates in an on-line fashion can provide
the ever popular microprocessor, a device traditionally restricted
from most mathematical applications because of its short word
length, with variable precision arithmetic capabilities.

Large-scale computing applications of on-line arithmetic have been
considered in [WAT 80]. In this research a multiprocessor
organization for large-scale numerical behavior of algorithms has
been studied.

On-line arithmetic can also be used in conjunction with large
serial memories (CCDs, Bubble memories, etc.). This application
depends on technological improvements of the foregoing memories.
The major user of the large serial memories will be data base
systems. Therefore, on-line arithmetic can provide instant
processing capabilities for such a data base system.

As a final word, on-line arithmetic is complementary to other
approaches that are used to achieve concurrency in the execution
of algorithms. For example, it can be used in minimal-depth
tree-structured networks. In particular, the use of on-line
arithmetic in non-linear recurrence systems would be advantageous
[ERC 80]. On-line units are also very attractive in reconfigurable
networks because of their high modularity and simple
interconnection.
CHAPTER 2


ERROR  CODES 

2.1  General  Remarks  on  Error  Codes 

Computation without error remains an elusive goal of considerable
importance in certain critical applications which require
sophisticated and extensive computation with a high degree of
system reliability. Recent advances in solid state technology have
provided individual devices with exceptional reliability. In some
systems, this improvement in device reliability has achieved
sufficient systems reliability. However, in others, the large
number of devices required has negated the improvement in
reliability at the systems level. Such problems can be solved by
the unlikely development of a perfect device which never fails. In
the absence of such a device, one can expect greater use of the
techniques of fault-tolerant computing to obtain improved systems
reliability. Such improvements are not obtained without
degradation in performance or increase in cost of the equipment,
but in many applications this tradeoff is justifiable.
One of the major approaches to fault-tolerant computing is the use
of error-detecting and error-correcting codes. In a practical
system there are occasional errors, and it is the purpose of codes
to detect and, perhaps, correct such errors. These codes cannot
correct every conceivable pattern of errors but rather must be
designed to correct only the most likely patterns. Much of coding
theory has been based on the assumption that each symbol is
affected independently, so that the probability of a given pattern
depends only on the number of errors. For example, codes have been
developed that correct any pattern of t or fewer errors in a block
of n symbols. Also, for those systems in which errors may occur in
bursts, a special kind of codes called "burst error codes" has
been devised. In the following section we summarize the existing
error codes with special attention to arithmetic error codes. We
will be using these types of codes throughout the rest of this
dissertation.
2.2 Types of Codes

There are two fundamentally different types of codes: linear and
non-linear codes. Between these two classes of codes, linear codes
are more important and, because of this, have a well developed
mathematical theory. In our review of error codes we only deal
with a subset of all linear codes, and therefore from now on we
restrict our attention only to this class of codes. Linear codes
are in turn divided into two classes: block codes and tree codes.
The encoder of a block code breaks the continuous sequence of
information digits into k-symbol sections or blocks. It then
operates on these blocks independently according to the particular
code to be employed. With each possible information block (k
symbols) is associated an n-tuple where n>k. The result is called
a codeword. The quantity n is referred to as the code length or
block length.
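
As a small illustration of block encoding, the sketch below uses
the single parity check code, chosen here only because it is the
simplest linear block code; it is not one of the codes developed
in this dissertation, and the function names are illustrative.
Each k-bit information block is mapped independently to a codeword
of length n = k+1.

```python
def block_encode(bits, k):
    """Split the information bits into k-bit blocks and append one parity bit to each."""
    if len(bits) % k != 0:
        raise ValueError("the number of information bits must be a multiple of k")
    codewords = []
    for start in range(0, len(bits), k):
        block = bits[start:start + k]
        parity = sum(block) % 2          # makes the number of ones in the codeword even
        codewords.append(block + [parity])
    return codewords


def parity_violated(codeword):
    """True if the codeword contains an odd number of ones, i.e. an error is detected."""
    return sum(codeword) % 2 != 0


words = block_encode([1, 0, 1, 1, 0, 0], k=3)
print(words)                             # [[1, 0, 1, 0], [1, 0, 0, 1]]
corrupted = [1, 1, 1, 0]                 # first codeword with one bit flipped
print(parity_violated(words[0]), parity_violated(corrupted))   # False True
```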

The other subset of linear codes, called a tree code, operates on
the information sequence without breaking it up into independent
blocks. Rather, the encoder for a tree code processes the
information continuously and associates each long information
sequence with a code sequence. The encoder breaks its input
sequence into l-symbol blocks, where l is usually a small number.
Then, on the basis of this l-tuple and the preceding information
symbols, it emits an m-symbol section of the code sequence. The
name "tree code" stems from the fact that the encoding rules for
this type of code are most conveniently described by means of a
tree graph.


Of the two classes of codes, the older block codes have a
considerably better developed theory. The reason for this seems to
be that block codes are more closely related to established,
relatively well understood, mathematical structures. As a result,
considerably more research has been done on them than on tree
codes [PET 72]. Block codes are in turn divided into three basic
subsets: Cyclic Codes, Non-cyclic Codes and Quasi-Cyclic Codes.
Among these three categories we are interested in a subset of
non-cyclic codes which are called "Arithmetic Error Codes". These
codes differ from all those previously stated in that all
operations are ordinary arithmetic. These codes are practical:
they can be used for data transmission with encoding and
operations performed by a general-purpose computer, or they can be
used to check the operation of an adder. There is an interesting
similarity in structure between arithmetic codes and cyclic codes.
Residue, Inverse-residue and AN codes belong to this class of
codes. Figure 2.1 summarizes the relation among different error
codes in a hierarchical manner.
2.2.1 Arithmetic Error Codes

An arithmetic code for us is a redundant representation of numbers
having the property that certain errors can be detected and/or
corrected in arithmetic operations using these codes. The
representation is redundant in that the number of digits used for
representing a number in a coded form may be larger than the
minimum number of digits required if no error control is desired.
The fundamental arithmetic operation is addition. Therefore, any
useful arithmetic code must at least have the capability to check
addition. Preferably, all other elementary operations, such as
multiplication and division, should be checked as well.

To represent the set of integers Z_m = {0, 1, ..., m-1} in the
radix r system, the number of digits required, k, is the smallest
integer greater than or equal to log_r m. Instead of using k
digits, as minimally required to represent Z_m, a redundant code
uses n digits for some n>k. This may be in the nature of adding an
extra n-k digits as checks to the non-redundant form of k digits;
or it may be to denote each number N ∈ Z_m by a product AN for
some constant integer A. Since these codes are used in checking
arithmetic operations, it is important to define how these
operations are carried out on redundant forms. Depending on how a
number N ∈ Z_m is represented as an n-tuple or how arithmetic is
performed on the codewords, the codes are classified as separate
or non-separate, and as systematic or non-systematic.
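
The digit counts k and n can be checked numerically. The following
sketch is illustrative only; digits_needed is a hypothetical
helper, and the values m = 16, r = 2 and A = 3 are arbitrary
choices for the example. Integer arithmetic is used instead of a
floating-point logarithm so that exact powers of the radix are
handled correctly.

```python
def digits_needed(m, radix):
    """Smallest k with radix**k >= m, i.e. the ceiling of log_radix(m)."""
    k = 0
    capacity = 1          # number of distinct values representable with k digits
    while capacity < m:
        capacity *= radix
        k += 1
    return k


m, radix, A = 16, 2, 3
k = digits_needed(m, radix)                  # digits for N in Z_m
n = digits_needed(A * (m - 1) + 1, radix)    # digits for the products A*N
print(k, n)                                  # 4 6
```

For these values the AN representation uses n = 6 binary digits
where k = 4 would suffice without error control, so n-k = 2 digits
of redundancy are carried.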

Definition  2.1 

An  arithmetic  code,  which  has  each  codeword  represented 
by,  say,  n  digits  is  systematic  if  there  exists  a  set  of  k 
digits  (k<n)  of  the  codeword  representing  the  information 
and  the  remaining  n-k  digits  representing  the  check(s). 

A systematic code may treat the two parts, i.e., the information
digits and the check digits, separately for the purpose of addition,
thereby defining two or more independent addition structures, one
for the information and the others for the checks; or it may treat
each codeword as a single operand (or number) and define uniform
addition rules for all n digits except perhaps for some end-around
carries. A systematic code of the former type is called separate,
and the latter nonseparate. A similar division into separate and
nonseparate classes can be made for all codes. Based on the
preceding, arithmetic codes fall into three major classes: 1) AN
codes, which are nonsystematic and therefore nonseparate; 2)
separate codes with one or more residue checks, for example the
(N, N mod A) residue code; and 3) the systematic subcodes, which
are also called systematic nonseparate codes. The AN codes were
first introduced by Diamond [DIA 55], and their error detection and
correction properties were discussed by Brown [BRO 60] and Peterson
[PET 72]. The separate codes using a single residue check, such as
the (N, N mod A) code, can only provide error detection for all
arithmetic related operations, but not correction and, therefore,
are of limited value [RAO 72]. In order to obtain error correction
by use of separate codes, two or more residue checks are required,
and that has led to the introduction of multiple residue codes
[AVI 67, AVI 69, RAO 70]. The systematic subcodes appear to have
error detection and/or correction properties similar to AN codes
while preserving the advantages of systematic codes.
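The difference between the separate (N, N mod A) residue code and the nonseparate AN code can be illustrated with a minimal sketch. The modulus A = 3, the function names and the injected single-bit fault are assumptions chosen only for illustration; they are not taken from the cited references.

```python
A = 3  # check modulus, assumed here only for illustration

def residue_encode(n):
    """Separate code: information part and check part kept as a pair."""
    return (n, n % A)

def residue_add(x, y):
    """Main and check parts are added by independent adders."""
    (n1, c1), (n2, c2) = x, y
    return (n1 + n2, (c1 + c2) % A)

def residue_ok(word):
    """Checking algorithm: recompute the residue and compare."""
    n, c = word
    return n % A == c

def an_encode(n):
    """Nonseparate code: the whole operand is multiplied by A."""
    return A * n

def an_ok(word):
    """Checking algorithm: a valid codeword is divisible by A."""
    return word % A == 0

# Fault-free additions pass both checks.
s_res = residue_add(residue_encode(25), residue_encode(17))
s_an = an_encode(25) + an_encode(17)

# A single flipped bit in the main result is caught by both checks.
faulty_res = (s_res[0] ^ 4, s_res[1])
faulty_an = s_an ^ 4
```

In the residue code the check digit travels beside the operand, so a second (check) adder is needed; in the AN code the single adder works on the wider product A*N.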

2.2.2 Low-Cost AN and Residue Codes

In  an  AN  code,  a  given  integer  N  is  represented  by  the 
product  A*N  for  some  suitable  constant  A.  A  is  commonly 
called  the  generator  (and  sometimes  check  modulus)  of  the 
code.  The  search  for  values  of  A  which  have  a  low-cost 
checking  algorithm  identified  the  class  of  low-cost  arith¬ 
metic  codes  which  employ  the  check  moduli  of  the  form 

$$A = 2^{a} - 1, \quad \text{with integer } a > 1. \qquad (2.1)$$

"a" is called the group length of the code [AVI 71]. AN codes with
the check modulus $2^{a} - 1$ display an exceptional adaptability to
binary arithmetic and have a low-cost checking algorithm when the
lengths of the operands are some multiple of the check length a.
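The reason a modulus of the form $2^{a} - 1$ is cheap to check is that $2^{a} \equiv 1 \pmod{2^{a}-1}$, so the residue can be formed by adding the a-bit groups of the operand with end-around carry. The following is a minimal software sketch of that idea for non-negative integers; the function name is hypothetical and the loop only mimics what a carry-save adder tree would do in hardware.

```python
def residue_low_cost(n: int, a: int) -> int:
    """Residue of a non-negative integer n modulo A = 2**a - 1,
    obtained by repeatedly summing the a-bit groups of n."""
    if a <= 1 or n < 0:
        raise ValueError("requires a > 1 and n >= 0")
    modulus = (1 << a) - 1
    total = n
    while total > modulus:
        folded = 0
        while total:
            folded += total & modulus  # add the lowest a-bit group
            total >>= a                # move to the next group
        total = folded
    # The all-ones pattern represents zero modulo 2**a - 1.
    return 0 if total == modulus else total
```

For example, with a = 2 (A = 3) the operand 0b110110 is folded as 10 + 01 + 11 in binary.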


As  was  mentioned  earlier  in  this  section,  residue  codes 
are  categorized  as  separate  codes.  Indeed,  Peterson  proved 
that  a  separate  code,  meaning  a  code  whose  information  and 
checks  are  separately  processed,  must  be  a  residue  code  [PET 
72]. The modulo A residue encoding for a number N attaches a check
symbol C(N) to form a pair (N, C(N)). The value of C(N) is:

$$C(N) = N \bmod A \qquad (2.2)$$

where $(N \bmod A) = |N|_A$ designates the modulo A residue of N.
The most significant differences of implementation between AN and
residue codes are caused by the property of separateness. For
residue codes, the operands $N_1$ and $N_2$ and their check symbols
$C(N_1)$, $C(N_2)$ enter separate (main and check) processors which
produce the main result $N_3$ and the check result $C(N_3)$. The
checking algorithm computes $(N_3 \bmod A)$ and compares it to
$C(N_3)$. If the values are equal, either the correct result has
been obtained, or a miss has occurred. Disagreement indicates a
fault in either the main or the check processor. For the
nonseparate AN code, in contrast, the checking algorithm computes
$(N_3 \bmod A)$, where $N_3$ is the value of the result. The case
$(N_3 \bmod A) = 0$ indicates either a correct result or a miss.
Note that the hardware cost of AN codes is
caused  by  the  greater  complexity  of  the  main  processor, 
while  for  residue  codes  it  is  because  of  the  need  for  a 
separate  check  processor.  The  error  detection  and  correc¬ 
tion  properties  of  AN  and  residue  codes  are  considered  in 
CHAPTER 3


ERROR  CODED  ON-LINE  ALGORITHMS 

In  this  chapter  we  present  the  main  result  of  this 
thesis.  Our  goal  is  to  develop  a  set  of  error  coded  basic 
algorithms  for  on-line  arithmetic  with  the  help  of  error 
codes  we  defined  in  Chapter  2  of  this  dissertation.  As  was 
mentioned  in  section  1.2.2,  on-line  algorithms  for  the  four 
basic  operations  of  addition/ subtraction,  multiplication  and 
division  have  already  been  devised  and  the  relevant  results 
on  this  subject  can  be  found  in  [ERC  75,  TRI  77,  TRI  78,  IRW 
77,  GOR  80]. 

On-line  algorithms  have  the  property  that  if  an  error 
occurs  at  a  certain  step  of  an  algorithm  and  if  this  error 
is  detected  immediately  after  generation  and  inhibited  from 
spreading  to  the  next  module,  then  the  operation  of  the  fol¬ 
lowing  units  can  be  continued  although  with  less  precision. 
Of  course  the  final  results  have  correspondingly  less  preci¬ 
sion  than  the  original  operands.  This  shows  that  on-line 
algorithms  have  an  intrinsic  property  of  "graceful  degrada¬ 
tion".  Of  course,  if  there  were  some  means  of  error  correc¬ 
tion,  then  this  error  would  not  affect  the  computation  and 
there  would  not  be  any  loss  of  precision.  Our  task  is  to 
devise such algorithms with the capability of error detection
and/or error correction.

In order to do this, we present two different schemes: 1) error
detection with residue-coded operands, 2) error
detection/correction with AN coded operands. Figure (3.1) is a
general block diagram for an on-line unit with residue encoding.


Figure  (3.1)-  Block  Diagram  of  A  Residue-Coded  On-Line  Unit 


$$Y = \sum_{i=1}^{m} y_i r^{-i} \qquad (3.2)$$

These two numbers flow through the MAIN unit digit-by-digit, most
significant digit (MSD) first. The algorithm run by the MAIN unit
(we call this algorithm "MAIN OP") is applied to the incoming
operands and, after a certain amount of delay, the result Z appears
at the output, again digit-by-digit starting with the MSD, such
that:

$$Z = \sum_{i=1}^{m} z_i r^{-i} \qquad (3.3)$$


At the same time the RESIDUE unit receives the residues of the
corresponding digits of the MAIN unit. We represent these "residue"
operands by X' and Y', such that:

$$X' = \sum_{i=1}^{m} x'_i r'^{-i} \qquad (3.4)$$

$$Y' = \sum_{i=1}^{m} y'_i r'^{-i} \qquad (3.5)$$

The following relation exists between these two sets of operands:

$$x'_i = x_i \bmod A \quad \text{for } i = 1, 2, \ldots, m \qquad (3.6)$$

$$y'_i = y_i \bmod A \quad \text{for } i = 1, 2, \ldots, m \qquad (3.7)$$


where A is the check modulus introduced in Chapter 2. We call the
algorithm applied by the RESIDUE unit "RESIDUE OP". The output of
the RESIDUE unit is represented in a similar manner by Z' and is:

$$Z' = \sum_{i=1}^{m} z'_i r'^{-i} \qquad (3.8)$$

Notice that the relation ($z'_i = z_i \bmod A$) is not necessarily
satisfied.
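Relations (3.6) and (3.7) attach a non-negative residue digit to every signed digit of the MAIN operands. A small sketch of that digit-wise encoding is given below; the digit string and the modulus A = 3 are hypothetical values, and the helper name is not from the thesis.

```python
def attach_residues(digits, A):
    """Pair each signed digit x_i with x'_i = x_i mod A (0 <= x'_i < A)."""
    return [(d, d % A) for d in digits]

# Hypothetical radix-10 signed-digit operand and check modulus.
coded = attach_residues([0, 9, -4, 6, -8], 3)
```

Python's `%` already returns a value in 0..A-1 for negative digits, which matches the non-redundant residue digit set assumed for the RESIDUE unit.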

After generation of $z_i$ and $z'_i$, the MAIN and RESIDUE units
start working on the next set of inputs. At the same time $z_i$ and
$z'_i$, along with some other information, reach the CHECK unit.
The CHECK unit operates with an algorithm we call "DETECT OP".
After running DETECT OP on $z_i$, $z'_i$ and the other received
information, this unit decides whether these results agree with
each other or not. If the results do not agree, it sets an error
flag which inhibits all the operations until the source of error is
detected. For example, the current step can be repeated by the MAIN
and RESIDUE units and, if the error still persists, the operation
can be continued with less precision. It is also possible to
correct this error if we use biresidue codes instead of a single
residue code [AVI 69, RAO 70]. In this thesis we do not address the
problem of error correction by biresidue codes.

In the second scheme the operands X and Y are encoded with AN
codes. Encoding is done by simply multiplying each digit of X and Y
by a check modulus (A). Denote these encoded operands by X' and Y'
respectively. Therefore:

$$X' = A \cdot X$$

$$Y' = A \cdot Y$$

such that:

$$x'_i = A \cdot x_i \quad \text{for } i = 1, 2, \ldots, m$$

$$y'_i = A \cdot y_i \quad \text{for } i = 1, 2, \ldots, m$$

The algorithm which operates on X' and Y' is the same as that for
residue encoded operands (Algorithm MAIN OP). The output digit
selection process in this algorithm should be such that the correct
output digit ($z'_i$) is divisible by A. Therefore each single
digit of the encoded operands and the results can be checked for
divisibility by A. If any of these digits is not divisible by A,
then it does not belong to the correct digit set and an error has
occurred. The overall organization of the AN coded on-line unit is
shown in Figure 3.2.

In  this  case  we  only  need  one  MAIN  Unit  and  the 
corresponding  CHECK  Unit  which  tests  the  operands  and  the 
results  for  divisibility  by  A. 
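The divisibility test performed by the CHECK Unit in the AN-coded scheme can be sketched as follows. This is only an illustration under assumed values (A = 3, a short signed-digit string); the function names are hypothetical.

```python
def an_encode_digits(digits, A):
    """Digit-wise AN encoding: every digit is multiplied by A."""
    return [A * d for d in digits]

def first_invalid_digit(coded_digits, A):
    """Return the 1-based position of the first digit that is not a
    multiple of A, or None if every digit passes the check."""
    for position, digit in enumerate(coded_digits, start=1):
        if digit % A != 0:
            return position
    return None

coded = an_encode_digits([1, 0, -1, 1], 3)   # hypothetical operand
corrupted = list(coded)
corrupted[2] = -2                            # injected digit fault
```

Because the check is applied to each digit as it appears, a fault is flagged at the step where it occurs rather than after the whole result has been produced.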

This method has the following advantage over the residue encoding.
If A is chosen appropriately then error correction is also possible
in this case. We briefly mentioned that in order to correct single
errors in the residue scheme we have to use biresidue codes instead
of a single residue code. Note that the hardware cost of AN codes
is in the greater complexity of the MAIN processor, while for
residue codes it is in the separate CHECK processor.


Figure (3.2)- Block Diagram of an AN-Coded On-Line Unit
3.1 Error-Coded On-Line Division


3.1.1 Residue Coded Operands

Assume that the dividend N and the divisor D are represented by m
digits in a radix-r redundant number system. Also assume that the
residue of each digit with respect to a constant A ($A = 2^{a}-1$)
is attached to it and is transferred to the on-line DIVIDE unit.
Therefore the coded operands are:

$$(N, N') = .(n_1, n'_1)(n_2, n'_2) \ldots (n_m, n'_m)$$

$$(D, D') = .(d_1, d'_1)(d_2, d'_2) \ldots (d_m, d'_m)$$

$$(Q, |Q|_A) = .(q_1, |q_1|_A)(q_2, |q_2|_A) \ldots (q_m, |q_m|_A)$$

$n_i$, $d_i$ and $q_i$ belong to the following symmetric
signed-digit sets [AVI 61]:


n^«{ “p  i • • « * -1 r Of  1/ « • • rp  }  (r— 1) ^ r / 2  (3.9) 
d^f-p  , . . ,-1,0, 1, . . ,p  }  (r-l)>p  >r/2  (3.10) 


q^-*  { — p ,  •  •  •  ,  —  1  ,  Of  1  ,  •  •  •  ,  p }  (  r— 1  p  >_r/2  (3.11) 

The algorithm "MAIN DIVIDE" which is run by the MAIN Unit is as
follows.

Algorithm "MAIN DIVIDE"

Step 1 [Initialization]:

    $P_0 = \sum_{i=1}^{\delta_{max}} n_i r^{-i}$

    $D_0 = \sum_{i=1}^{\delta_{max}} d_i r^{-i}$

    $Q_0 = 0$

For j = 1, 2, ..., m Do:

Step 2 [Selection]:

    $q_j = \mathrm{SELECT}(rP_{j-1}, D_{j-1})$

    $Q_j = Q_{j-1} + q_j r^{-j}$

Step 3 [Input Digits]:

    $D_j = D_{j-1} + d_{j+\delta_{max}} r^{-j-\delta_{max}}$

Step 4 [Basic Recursion]:

    $P_j = rP_{j-1} - q_j D_j + n_{j+\delta_{max}} r^{-\delta_{max}} - Q_{j-1} d_{j+\delta_{max}} r^{-\delta_{max}}$    (3.12)

Step 5 [End Do]

The algorithm run by the RESIDUE unit is similar to this and is
named "RESIDUE DIVIDE".

Algorithm "RESIDUE DIVIDE"

Step 1 [Initialization]:

    $P'_0 = \sum_{i=1}^{\delta_{max}} n'_i r'^{-i}$

    $D'_0 = \sum_{i=1}^{\delta_{max}} d'_i r'^{-i}$

    $Q'_0 = 0$

For j = 1, 2, ..., m Do:

Step 2 [Selection]:

    $q'_j = \mathrm{SELECT}(r'P'_{j-1}, D'_{j-1})$

    $Q'_j = Q'_{j-1} + q'_j r'^{-j}$

Step 3 [Input Digits]:

    $D'_j = D'_{j-1} + d'_{j+\delta_{max}} r'^{-j-\delta_{max}}$

Step 4 [Basic Recursion]:

    $P'_j = r'P'_{j-1} - q'_j D'_j + n'_{j+\delta_{max}} r'^{-\delta_{max}} - Q'_{j-1} d'_{j+\delta_{max}} r'^{-\delta_{max}}$    (3.13)

Step 5 [End Do]
r' is the radix of the RESIDUE unit, which should be as small as
possible. $\delta_{max}$ is the maximum of the on-line delays
required by the MAIN and the RESIDUE Units and will be defined
later. Also $n'_i$ and $d'_i$ are defined as was explained before:

$$n'_i = n_i \bmod A \quad \text{for } i = 1, 2, \ldots, m$$

$$d'_i = d_i \bmod A \quad \text{for } i = 1, 2, \ldots, m$$

Therefore,

$$n'_i,\ d'_i,\ |q_i|_A \in \{0, 1, 2, \ldots, (A-1)\} \qquad (3.14)$$

The output digits of the RESIDUE Unit, which form the quotient of
the residues, are assumed to be in the following set:

$$q'_i \in \{-\rho', \ldots, -1, 0, 1, \ldots, \rho'\} \qquad (r'-1) \ge \rho' \ge r'/2 \qquad (3.15)$$

In what follows we prove that the Algorithm "MAIN DIVIDE" and
similarly the Algorithm "RESIDUE DIVIDE" converge to the correct
value of the quotient.


Proof of Convergence

By induction on j in the basic recursion formula (Eq. 3.12) we get:

j = 1:

$$P_1 = r \sum_{i=1}^{\delta_{max}} n_i r^{-i} - q_1 \sum_{i=1}^{1+\delta_{max}} d_i r^{-i} + n_{1+\delta_{max}} r^{-\delta_{max}}$$

$$P_1 = r \sum_{i=1}^{1+\delta_{max}} n_i r^{-i} - q_1 \sum_{i=1}^{1+\delta_{max}} d_i r^{-i}$$

j = 2:

$$P_2 = r^{2} \sum_{i=1}^{1+\delta_{max}} n_i r^{-i} - r q_1 \sum_{i=1}^{1+\delta_{max}} d_i r^{-i} - q_2 \sum_{i=1}^{2+\delta_{max}} d_i r^{-i} + n_{2+\delta_{max}} r^{-\delta_{max}} - q_1 r^{-1} d_{2+\delta_{max}} r^{-\delta_{max}}$$

$$P_2 = r^{2} \sum_{i=1}^{2+\delta_{max}} n_i r^{-i} - (r q_1 + q_2) \sum_{i=1}^{2+\delta_{max}} d_i r^{-i}$$

Continuing this procedure we obtain:

$$P_j = r^{j} \sum_{i=1}^{j+\delta_{max}} n_i r^{-i} - r^{j} \Big[ \sum_{i=1}^{j} q_i r^{-i} \Big] \sum_{i=1}^{j+\delta_{max}} d_i r^{-i} \qquad (3.16)$$

If j = m then:

$$P_m = r^{m} \sum_{i=1}^{m} n_i r^{-i} - r^{m} \Big[ \sum_{i=1}^{m} q_i r^{-i} \Big] \sum_{i=1}^{m} d_i r^{-i}$$

or

$$r^{-m} P_m = N - Q \cdot D$$

From this equation Q is obtained:

$$Q = \frac{N}{D} - r^{-m} \frac{P_m}{D} \qquad (3.17)$$

Therefore, by devising a quotient digit selection procedure, SELECT
in step 2 of the Algorithm "MAIN DIVIDE", such that

$$|P_m| < D$$

the quotient $Q = N/D$ can be computed to m digits of precision.
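The identity $r^{-m}P_m = N - Q \cdot D$ does not depend on how the quotient digits are selected, which makes it easy to check numerically. The sketch below assumes the basic recursion has the form $P_j = rP_{j-1} - q_jD_j + n_{j+\delta}r^{-\delta} - Q_{j-1}d_{j+\delta}r^{-\delta}$ (the form visible in Algorithm "RESIDUE DIVIDE") and takes the quotient digits as a given input instead of implementing SELECT. The digit strings, the delay of 4 and the function name are illustrative assumptions, not data from the thesis.

```python
from fractions import Fraction

def run_recursion(n, d, q, r, delta):
    """Apply the assumed basic recursion for j = 1..m with exact
    rational arithmetic; returns (P_m, Q_m). Operand digits beyond
    position m are taken to be zero."""
    m = len(q)
    n = list(n) + [0] * delta
    d = list(d) + [0] * delta
    P = sum((Fraction(n[i - 1], r ** i) for i in range(1, delta + 1)), Fraction(0))
    D = sum((Fraction(d[i - 1], r ** i) for i in range(1, delta + 1)), Fraction(0))
    Q = Fraction(0)
    for j in range(1, m + 1):
        n_in, d_in = n[j + delta - 1], d[j + delta - 1]
        D = D + Fraction(d_in, r ** (j + delta))          # Step 3
        P = (r * P - q[j - 1] * D                         # Step 4
             + Fraction(n_in, r ** delta)
             - Q * Fraction(d_in, r ** delta))            # uses Q_{j-1}
        Q = Q + Fraction(q[j - 1], r ** j)                # Q_j
    return P, Q

# Illustrative radix-10 digit strings (signed quotient digits).
n_digits = [1, 2, 6, 6, 0, 1]
d_digits = [5, 4, 9, 7, 7, 4]
q_digits = [2, 3, 0, 3, -2, -2]
P_m, Q_m = run_recursion(n_digits, d_digits, q_digits, r=10, delta=4)
N = Fraction(126601, 10 ** 6)
D = Fraction(549774, 10 ** 6)
```

With these inputs `P_m / 10**6` equals `N - Q_m * D` exactly; the role of SELECT is only to keep $|P_m|$ below D so that the residual term in (3.17) stays under one unit in the last place.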


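A similar proof is valid for the RESIDUE Unit:

$$P'_j = r'^{j} \sum_{i=1}^{j+\delta_{max}} n'_i r'^{-i} - r'^{j} \Big[ \sum_{i=1}^{j} q'_i r'^{-i} \Big] \sum_{i=1}^{j+\delta_{max}} d'_i r'^{-i} \qquad (3.18)$$

$$Q' = \frac{N'}{D'} - r'^{-m} \frac{P'_m}{D'} \qquad (3.19)$$

and assuming $|P'_m| < D'$ then $Q' = N'/D'$ to m digits of
precision.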


3.1.1.1 The Error Detection Algorithm


The purpose of this section is to find an algorithm that can detect
an error at each step of the on-line division process. This
algorithm is run after generation of $q_j$ and $q'_j$ by the MAIN
and RESIDUE Units and determines whether these quotient digits are
correct or not. If an error is detected then the current step is
repeated, otherwise the division process proceeds as usual.


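From Equations (3.16) and (3.18) we have:

$$\Big[ \sum_{i=1}^{j} q_i r^{-i} \Big] \sum_{i=1}^{j+\delta_{max}} d_i r^{-i} = \sum_{i=1}^{j+\delta_{max}} n_i r^{-i} - r^{-j} P_j$$

and

$$\Big[ \sum_{i=1}^{j} q'_i r'^{-i} \Big] \sum_{i=1}^{j+\delta_{max}} d'_i r'^{-i} = \sum_{i=1}^{j+\delta_{max}} n'_i r'^{-i} - r'^{-j} P'_j$$

By dividing these two equations, taking the residues of both sides,
and also assuming:

$$|r|_A = |r'|_A = 1 \qquad (3.20)$$

we get the following equation:

$$\Big| \Big| \sum_{i=1}^{j} q_i \Big|_A \cdot \Big| \sum_{i=1}^{j+\delta_{max}} n'_i - |P'_j|_A \Big|_A \Big|_A = \Big| \Big| \sum_{i=1}^{j} q'_i \Big|_A \cdot \Big| \sum_{i=1}^{j+\delta_{max}} n_i - |P_j|_A \Big|_A \Big|_A \qquad (3.21)$$

In the equation above: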
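Algorithm "DETECT DIVIDE"

Step 1 [Initialization]:

    $X_0 = X'_0 = 0$

    $Y_0 = \big| \sum_{i=1}^{\delta_{max}} n_i \big|_A$

    $Y'_0 = \big| \sum_{i=1}^{\delta_{max}} n'_i \big|_A$

For j = 1, 2, ..., m Do:

Step 2 [Input Digits]:

    $X_j = |X_{j-1} + q_j|_A$

    $X'_j = |X'_{j-1} + q'_j|_A$

    $Y_j = |Y_{j-1} + n_{j+\delta_{max}}|_A$

    $Y'_j = |Y'_{j-1} + n'_{j+\delta_{max}}|_A$

    $P_j$ and $P'_j$

Step 3 [Check for Error]:

    $Z_j = \big| X_j \cdot (Y'_j - |P'_j|_A) \big|_A$

    $Z'_j = \big| X'_j \cdot (Y_j - |P_j|_A) \big|_A$

    If ($Z_j \ne Z'_j$) E=1, GOTO ERROR SUBROUTINE

Step 4 [End Do]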


3.1.1.2 Determination of $\delta_{max}$


In order to have overlap between the adjacent selection regions of
the $P_j$-$D_j$ plot, the minimum index difference ($\delta$) for
the case of redundant dividend and divisor is found to be [GOR 80]:


I  r(2k-l)(l-k  )| 


(3.23) 


II  III 

k,k  and  k  are  defined  as: 


k— e_ 

r-l 


r-l 


x,,,=e 


r-l 


Since  division  in  the  RESIDUE  Unit  is  performed  with 
non-redundant  operands,  we  get  [GOR  80]: 

§ '  =  |*2  -  log  r ,  (3.24) 

_  I 

where $k' = \rho'/(r'-1)$. We define $\delta_{max}$ to be the
maximum of $\delta$ and $\delta'$:

$$\delta_{max} = \mathrm{MAX}(\delta, \delta') \qquad (3.25)$$


3.1.1.3 Radix of The RESIDUE Unit


As was mentioned earlier, the radix of the RESIDUE Unit is an
important factor in the design of the error-coded units, because as
r' increases the amount of hardware needed for the RESIDUE Unit
increases. In the extreme case where r = r' the detection process
is merely duplication of the MAIN Unit. On the other hand there are
some lower bounds for r' that should be met. These bounds are
calculated as follows:

Since the residue digits ($n'_i$, $d'_i$) are assumed to be in a
radix r' number system we have:

$$n'_i,\ d'_i \le r' - 1$$

using Eq. (3.14) we get:

$$A - 1 \le r' - 1 \quad \text{or} \quad r' \ge A \qquad (3.26)$$

from (3.20) and (3.26) we obtain:

$$r' = M'A + 1 \quad \text{for } M' = 1, 2, \ldots \qquad (3.27)$$

and if we assume that A is a low-cost modulus ($A = 2^{a} - 1$)
then:

$$r' = M' 2^{a} - M' + 1 \quad \text{for } M' = 1, 2, \ldots \qquad (3.28)$$

also from (3.20) we get:

$$r = MA + 1 \quad \text{for } M = 1, 2, \ldots \qquad (3.29)$$

3.1.1.4 An Example of The Error Detection Process

The following is a numerical example of the error-detection process
when residue-coded operands are used.


Assume: 


\ 
f 

Using  the  selection  diagram  shown  in  Figure  (3.3)  and 
the  Algorithm  MAIN  DIVIDE,  the  following  table  is  obtained 
for  the  operation  of  the  MAIN  Unit  (Table  3.1). 


  j  |  P_{j-1}   |  Q_j      |  D_j      |  q_j
 ----+------------+-----------+-----------+------
  1  |  .1266     |  .2       |  .54977   |  2
  2  |  .16646    |  .23      |  .549774  |  3
  3  |  .015298   |  .230     |  .549774  |  0
  4  |  .15298    |  .2303    |  .549774  |  3
  5  |  -.119522  |  .23032   |  .549774  |  2
  6  |  -.095672  |  .230322  |  .549774  |  2

Table (3.1)- Results Obtained by the MAIN Unit (Ex. 3.1.1.4)


According to this table:


Q = Q_6 = .230322


Figure (3.4) shows the selection diagram for the RESIDUE Unit.


Figure 3.4 - Selection Diagram for the RESIDUE Unit
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Similar  to  Table  (3.1),  Table  (3.2)  shows  the  results 
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1 
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T  T 

1  check  1 

X  X 
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1 
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1 

2 
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0 

1 
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1 

m  m  l  m 
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1 
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__X- 

0 

-  1  —  "T 
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X  X 

1 

X  — 

5 

1 

0 

T“ 
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._x_ 

0 

1 

_X- 

1 

1 

..X. 

1 

1 

_x. 

0 

1 

__x« 

0 

1  check  1 

X  X 

1 

+- 

6 

T 

i 

— +- 

1 

T" 

1 

— +- 

0 

T* 

1 

-+- 

1 

T  " 

1 

— +- 

1 

“  l  " 

1 

-+- 

0 

1 

— +- 

0 

1  check  1 
-+ - + 

Table  (3.3)-  Results  Obtained  by  the  CHECK  Unit. 


Now assume that at step 3 of the "MAIN DIVIDE" Algorithm an error
in the Multi-Input Redundant Adder causes the partial remainder
$P_3$ (= 0.15298) to be incorrect. Assume this wrong result
($P^*_3$) is:

$$P^*_3 = 0.15296$$

Continuing the algorithm "DETECT DIVIDE" from step 3 we get:

j = 3

$$X_3 = |2 + 0|_A = 2$$
$$X'_3 = |1 - 2|_A = 2$$

$$Y_3 = |1 + 0|_A = 1$$
$$Y'_3 = |1 + 0|_A = 1$$

$$Z_3 = |2 \cdot (1 - 1)|_A = 0$$
$$Z'_3 = |2 \cdot (1 - 2)|_A = 1$$

Since $Z_3 \ne Z'_3$ this error will be detected by the CHECK Unit.

When no error is detected by the CHECK unit, the current quotient
digit is delivered to the next on-line unit along with its residue
modulo A [notice that ($q_j \bmod A$) is not necessarily equal to
$q'_j$]. These two constitute one of the operands of the following
on-line unit.
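The comparison made in Step 3 of "DETECT DIVIDE" for this example can be replayed in a few lines. The sketch assumes A = 3, which is the modulus implied by the arithmetic shown (|1 - 2|_A = 2), and assumes that the fault-free residue of $P_3$ is 1 while the faulty value $P^*_3$ has residue 2; the function name is hypothetical.

```python
def detect_divide_check(X, X_res, Y, Y_res, P_mod, P_res_mod, A):
    """Cross-check of Step 3: Z uses the RESIDUE unit's partial
    remainder, Z' uses the MAIN unit's partial remainder."""
    Z = (X * (Y_res - P_res_mod)) % A
    Z_prime = (X_res * (Y - P_mod)) % A
    return Z, Z_prime

A = 3  # assumed check modulus
# Values at j = 3 with the faulty partial remainder (residue 2).
faulty = detect_divide_check(X=2, X_res=2, Y=1, Y_res=1,
                             P_mod=2, P_res_mod=1, A=A)
# Same step with the assumed fault-free residue of P_3 (residue 1).
clean = detect_divide_check(X=2, X_res=2, Y=1, Y_res=1,
                            P_mod=1, P_res_mod=1, A=A)
error_flag = faulty[0] != faulty[1]
```

Note that an error which changes the partial remainder by a multiple of A would leave both values unchanged; this is the "miss" case of a single residue check.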
3.1.2 AN Coded Operands

In the previous section we discussed the detection of errors in an
on-line divide unit when the dividend and the divisor are residue
coded. As was mentioned before, it is possible to use AN coded
operands for the purpose of error detection and/or error
correction. In this section we present a summary of the proposed
algorithms when AN coded operands are used.

Again we denote the operands by N, D and Q for the dividend,
divisor and the quotient. Encoded operands are obtained by simply
multiplying each digit of N, D and Q by the check modulus A. Denote
these encoded operands by N', D' and Q'. The table of the next page
shows the correspondence between the two sets of operands and the
results. For a reason which will be explained later, the digits of
the dividend (N) are multiplied by $A^2$ instead of A [AVI 73].
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P'.-Al'-a'O’]  or  Q’=§; 
Now if we prove that

$$|P'_m| < AD' \qquad (3.36)$$

then $Q' = N'/D'$ to m digits of precision. The quotient Q' is:

$$Q' = \frac{N'}{D'} = \frac{A^{2} N}{A D} = AQ \qquad (3.37)$$

which is the correct result and shows the reason why we have to
multiply $n_i$ by $A^2$ instead of A.
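The effect of scaling the dividend by $A^2$ can be checked with exact rational arithmetic. The operand values below are arbitrary illustrative fractions and the function name is hypothetical; the sketch only compares the two possible scalings of the dividend.

```python
from fractions import Fraction

def coded_quotient(N, D, A, dividend_factor):
    """Quotient of the encoded operands N' = dividend_factor*N, D' = A*D."""
    return (dividend_factor * N) / (A * D)

N, D, A = Fraction(1, 8), Fraction(5, 8), 3
q_plain = N / D
q_with_A_squared = coded_quotient(N, D, A, A * A)   # N' = A^2 * N
q_with_A_only = coded_quotient(N, D, A, A)          # N' = A * N
```

With the dividend scaled by $A^2$ the result is A times the plain quotient, i.e. an AN-coded quotient; with a factor of A only, the code multiplier cancels and the quotient leaves the unit unprotected.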


3.1.2.1 Selection of The Quotient Digits

One of the most important factors in the design of the AN-coded
division unit is the selection procedure. As was mentioned before,
selection is such that the correct quotient digit is a multiple of
the check modulus (A). With the help of the basic recursion formula
(3.12) and following the procedure given in [GOR 80] the bounds on
the partial remainder are obtained:

$$kAD'_j - A^{2}(k' + kk')r^{-\delta} \ \ge\ P'_j \ \ge\ -kAD'_j + A^{2}(k' + kk')r^{-\delta} \qquad (3.38)$$

By letting j = m in (3.38) we get:

$$AD' \ \ge\ |P'_m| \ \ge\ -AD'$$

Therefore, Eq. (3.36) is satisfied and Q' is indeed the correct
quotient up to m digits. Also following the procedure given in
[GOR 80] we get the following set of selection equations:
$$(i' + Ak)D'_j - A^{2}(k' + kk')r^{-\delta+1} \ \ge\ rP'_j \ \ge\ (i' - Ak)D'_j + A^{2}(k' + kk')r^{-\delta+1} \qquad (3.39)$$


This condition can be graphically described by means of a
$P'_j$-$D'_j$ plot [ATK 68]. It consists of a family of curves
which are linear functions of $D'_j$ with $q'_j$ as parameter,
ranging from $-A\rho$ to $+A\rho$ in steps of A. The area between
the maximum $rP'_j$ and the minimum $rP'_j$ will be denoted the
$q'_j = i'$ region. A given value of $D'_j$ and $rP'_j$ will
correspond to a point in an i'-selection region. The quotient digit
$q'_j$ is, therefore, i' and is used in forming the next partial
remainder. Figure (3.5) is an example of a full $P'_j$-$D'_j$ plot
with r = 2, k = k' = k'' = 1, A = 3 and $\delta$ = 4.
3.1.2.2 Determining The Minimum Index Difference


The minimum allowable value of $\delta$ can be determined by
requiring that the lower bound of a $q'_j = i'$ selection region
and the upper bound of the corresponding $q'_j = i' - A$ selection
region intersect at the minimum value of D'. Therefore, by using
Eq. (3.39) we get:

$$r^{-\delta+1} \ \le\ \frac{2k - 1}{2A(k' + kk')} (D'_j)_{min} \qquad (3.40)$$

From Eq. (3.33), assuming $m \to \infty$, the minimum value of
$D'_j$ is found to be:

$$(D'_j)_{min} = A r^{-1}$$

inserting this into Eq. (3.40) the worst case $\delta$ is found:

$$\delta = \Big\lceil 2 - \log_r \frac{2k - 1}{2(k' + kk')} \Big\rceil \qquad (3.41)$$

By referring to [GOR 80] we find that this $\delta$ is exactly the
same as that found for ordinary operands. Therefore the proposed
encoding does not change the minimum delay required.
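The smallest index difference satisfying condition (3.40) can also be found by direct search, which avoids floating-point logarithms. The sketch below reads (3.40) as $r^{-\delta+1} \le \frac{2k-1}{2Ac}(D'_j)_{min}$, where `c` stands for the bracketed constant of (3.38) and (3.39); treating that constant as a single parameter, the non-strict inequality and the function name are all assumptions of this illustration.

```python
from fractions import Fraction

def min_index_difference(r, k, c, A, d_min):
    """Smallest integer delta >= 1 with r**(1 - delta) <= bound,
    where bound = (2k - 1) / (2*A*c) * d_min."""
    bound = Fraction(2 * k - 1) / (2 * A * c) * d_min
    if bound <= 0:
        raise ValueError("no overlap possible: need 2k - 1 > 0")
    delta = 1
    while Fraction(1, r ** (delta - 1)) > bound:
        delta += 1
    return delta

# Values of the AN-coded example: r = 2, k = 1, constant c = 2, A = 3,
# normalized divisor so that the minimum encoded divisor is A/r = 1.5.
delta = min_index_difference(r=2, k=1, c=2, A=3, d_min=Fraction(3, 2))
```

Because $(D'_j)_{min}$ is proportional to A, the factor A cancels in the bound, which is consistent with the statement that the encoding leaves the minimum delay unchanged.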


3.1.2.3 An Example of Division With AN-Coded Operands


Assume:

r = 2, k = k' = k'' = 1 and A = 3

Also we assume that D is normalized (not pseudonormalized).
Therefore:

$$(D'_j)_{min} = 1.5$$

inserting this in (3.40) results in $\delta \ge 4$. Assume:

$$\delta = 4$$

from Equations (3.30), (3.31) and (3.32) we get:

$$q'_j \in \{-3, 0, 3\}$$
$$n'_j \in \{-9, 0, 9\}$$
$$d'_j \in \{-3, 0, 3\}$$

and also:

$$3 \ \ge\ D' \ \ge\ \tfrac{3}{2}$$
$$3 \ \ge\ Q' \ \ge\ -3$$
$$9 \ \ge\ N' \ \ge\ -9$$

Plugging the given values into Eq. (3.39), the selection regions
are obtained:

$$(i' + 3)D'_j - 18 \cdot 2^{-3} \ \ge\ 2P'_j \ \ge\ (i' - 3)D'_j + 18 \cdot 2^{-3}$$

where $i' \in \{-3, 0, 3\}$.

Figure (3.5) shows the selection regions obtained from this set of
inequalities. Now assume the dividend and the divisor are:

N' = .990999

D' = .333303

Following the "MAIN DIVIDE" algorithm shown in Section 3.1.1 of
this thesis, Table (3.4) will result.


  j  |  P'_{j-1}  |  Q'_j     |  D'_j     |  q'_j
 ----+------------+-----------+-----------+-------
  1  |  .9909     |  .3       |  .33330   |  3
  2  |  .0        |  .30      |  .333303  |  0
  3  |  .0005?    |  .300     |  .333303  |  0
  4  |  .0099     |  .3003    |  .333303  |  3
  5  |  .090909   |  .30033   |  .333303  |  3
  6  |  .000099   |  .300330  |  .333303  |  0
  7  |  .00099    |  -        |  -        |  -

Table (3.4)- An Example of AN-Coded Division


According to this table:

$$Q' = Q'_6 = (.300330)_2$$

By looking at columns two and three of the above table, it can be
confirmed that all the digits of $P'_{j-1}$ and $Q'_j$ are
multiples of the check modulus (A = 3). Therefore, the necessary
condition for the correctness of the division process is satisfied.


3.2 Error-Coded On-Line Multiplication

3.2.1 Residue Coded Operands

Assume that the multiplicand (X) and the multiplier (Y) are
represented by m digits in a radix-r redundant number system. Also
assume that the residue of each digit with respect to a constant A
($A = 2^{a} - 1$) is attached to it and is sent to the on-line
multiplication unit. Therefore the coded operands are:

$$(X, X') = .(x_1, x'_1)(x_2, x'_2) \ldots (x_m, x'_m)$$

$$(Y, Y') = .(y_1, y'_1)(y_2, y'_2) \ldots (y_m, y'_m)$$

The product R is also represented by an m digit radix-r redundant
number. The residue of each product digit is also attached to it
while leaving the multiplication unit.

$$(R, |R|_A) = .(p_1, |p_1|_A)(p_2, |p_2|_A) \ldots (p_m, |p_m|_A)$$

Since X and Y are assumed to be redundant, $x_i$ and $y_i$ belong
to the following digit set:

$$x_i,\ y_i \in \{-\rho'', \ldots, -1, 0, 1, \ldots, \rho''\} \qquad (3.42)$$

X' and Y' are not redundant, therefore:

$$x'_i,\ y'_i,\ |p_i|_A \in \{0, 1, \ldots, (A-1)\} \qquad (3.43)$$

Relation (3.43) is obtained from the definition of the residue
function.
X and Y are assumed to be bounded by a positive constant M such
that:

$$M \ \ge\ X,\ Y \ \ge\ -M \qquad (3.44)$$

and similarly:

$$M' \ \ge\ X',\ Y' \ \ge\ 0 \qquad (3.45)$$

The operands pass through a MAIN Unit which performs the algorithm
"MULT" given in Appendix C. The result R is also in a redundant
number system such that:

$$R = \sum_{i=1}^{m} p_i r^{-i}$$

and its digits belong to the following digit set:

$$p_i \in \{-\rho, \ldots, -1, 0, 1, \ldots, \rho\} \qquad (3.46)$$

note that $\rho''$ and $\rho$ may be different.

The residue digits pass through a RESIDUE Multiplication Unit. The
same algorithm (MULT) operates on them, that is, they are
multiplied in an on-line mode. The product of the residues will be
designated by R' and is defined as:

$$R' = \sum_{i=1}^{m} p'_i r'^{-i}$$

and $p'_i$ belongs to the following set:

$$p'_i \in \{-\rho', \ldots, -1, 0, 1, \ldots, \rho'\} \qquad (3.47)$$

note that even though:

$$x'_i = |x_i|_A \quad \text{and} \quad y'_i = |y_i|_A$$

$p'_i$ may not be equal to $|p_i|_A$.

Proof of convergence of the algorithm MULT for the MAIN and RESIDUE
operands is the same and is given in Appendix C.


3.2.1.1 The Error Detection Algorithm

The purpose of this section is to develop an algorithm that can
detect an error at each step of the on-line multiplication unit.
This algorithm will be run after generation of $p_j$ and $p'_j$ and
will determine whether these product digits are legal or not.

To derive this algorithm, from Eq. (C.8) in Appendix C, we have:

$$r^{-j} P_j = X_j Y_j - R_{j-1} \qquad (3.48)$$

Following a similar procedure as that given in Appendix C, for the
RESIDUE Unit we get:

$$r'^{-j} P'_j = X'_j Y'_j - R'_{j-1} \qquad (3.49)$$

where $X_j$, $Y_j$, $X'_j$, $Y'_j$, $R_{j-1}$ and $R'_{j-1}$ are
defined below:

$$X_j = \sum_{i=1}^{j} x_i r^{-i} \quad \text{and} \quad Y_j = \sum_{i=1}^{j} y_i r^{-i}$$

$$X'_j = \sum_{i=1}^{j} x'_i r'^{-i} \quad \text{and} \quad Y'_j = \sum_{i=1}^{j} y'_i r'^{-i}$$

$$R_{j-1} = \sum_{i=1}^{j-1} p_i r^{-i} \quad \text{and} \quad R'_{j-1} = \sum_{i=1}^{j-1} p'_i r'^{-i}$$

Taking the residues of $X_j$, $X'_j$, $Y_j$ and $Y'_j$ with respect
to A we get:

$$|X_j|_A = \Big| \sum_{i=1}^{j} x'_i \, |r|_A^{-i} \Big|_A$$

$$|X'_j|_A = \Big| \sum_{i=1}^{j} x'_i \, |r'|_A^{-i} \Big|_A$$

if we assume $|r|_A = |r'|_A$ then:

$$|X_j|_A = |X'_j|_A$$

and similarly:

$$|Y_j|_A = |Y'_j|_A$$

Rearranging Equations (3.48) and (3.49) we obtain:

$$r^{-j} P_j + R_{j-1} = X_j Y_j$$

$$r'^{-j} P'_j + R'_{j-1} = X'_j Y'_j$$

Taking the residues of both sides with respect to A:

$$\Big| \, |r^{-j}|_A |P_j|_A + |R_{j-1}|_A \Big|_A = \Big| \, |X_j|_A \cdot |Y_j|_A \Big|_A = \Big| \, |X'_j|_A \cdot |Y'_j|_A \Big|_A$$

$$\Big| \, |r'^{-j}|_A |P'_j|_A + |R'_{j-1}|_A \Big|_A = \Big| \, |X'_j|_A \cdot |Y'_j|_A \Big|_A$$

Therefore:

$$\Big| \, |r^{-j}|_A |P_j|_A + |R_{j-1}|_A \Big|_A = \Big| \, |r'^{-j}|_A |P'_j|_A + |R'_{j-1}|_A \Big|_A \qquad (3.50)$$

This is the relation that we check at each step of the
multiplication algorithm for correctness of the MAIN and RESIDUE
Units. To simplify the checking process we assume:

$$|r|_A = |r'|_A = 1 \qquad (3.51)$$

Therefore, (3.50) reduces to:

$$\Big| \, |P_j|_A + |R_{j-1}|_A \Big|_A = \Big| \, |P'_j|_A + |R'_{j-1}|_A \Big|_A \qquad (3.52)$$

The following algorithm is run by the CHECK Unit. The inputs of
this unit are ($P_j$, $p_{j-1}$) from the MAIN and ($P'_j$,
$p'_{j-1}$) from the RESIDUE Units.


Algorithm "DETECT MULT"

Step 1 [Initialization]:

    $R_{-1} = 0$, $p_0 = 0$

    $R'_{-1} = 0$, $p'_0 = 0$

For j = 1, 2, ..., m+1 Do:

Step 2 [Input Digits]:

    $R_{j-1} = |R_{j-2} + p_{j-1}|_A$

    $R'_{j-1} = |R'_{j-2} + p'_{j-1}|_A$

    $P_j$ and $P'_j$

Step 3 [Check for Error]:

    $Z_j = \big| \, |P_j|_A + R_{j-1} \big|_A$

    $Z'_j = \big| \, |P'_j|_A + R'_{j-1} \big|_A$

    If ($Z_j \ne Z'_j$) E=1, GOTO ERROR SUB

Step 4 [End Do]
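Relation (3.52) can be exercised numerically without the MULT algorithm of Appendix C. In the sketch below the partial products are not produced by MULT; they are derived from (3.48) and (3.49) for arbitrary product digits, so the code only illustrates that the two check values agree when the operands are consistent and disagree after an injected fault. The residue of a fraction is taken by scaling it to an integer with powers of the radix, which is valid only because the radix is congruent to 1 modulo A. All digit strings, the radices 10 and 4, A = 3 and the function names are assumptions of this illustration.

```python
from fractions import Fraction

def prefix_value(digits, radix):
    """Value of .d1 d2 ... in the given radix (signed digits allowed)."""
    return sum((Fraction(d, radix ** i)
                for i, d in enumerate(digits, start=1)), Fraction(0))

def residue(value, radix, A):
    """Residue mod A of a finite radix fraction; needs radix % A == 1."""
    assert radix % A == 1
    v = Fraction(value)
    while v.denominator != 1:
        v *= radix            # scaling by the radix leaves the residue unchanged
    return int(v) % A

def check_value(x, y, p, j, radix, A, fault=Fraction(0)):
    """| |P_j|_A + |R_{j-1}|_A |_A with P_j taken from (3.48)/(3.49)."""
    X = prefix_value(x[:j], radix)
    Y = prefix_value(y[:j], radix)
    R_prev = prefix_value(p[:j - 1], radix)
    P = radix ** j * (X * Y - R_prev) + fault
    return (residue(P, radix, A) + residue(R_prev, radix, A)) % A

A = 3
x, y = [0, 0, 9, -4, 6], [0, 0, 3, 7, -2]        # MAIN operands, radix 10
x_res = [d % A for d in x]                        # RESIDUE operands, radix 4
y_res = [d % A for d in y]
p, p_res = [0, 0, 0, 1, -3], [0, 0, 1, -2, 1]     # arbitrary product digits
agree = [check_value(x, y, p, j, 10, A) == check_value(x_res, y_res, p_res, j, 4, A)
         for j in range(1, 6)]
faulty = check_value(x_res, y_res, p_res, 4, 4, A, fault=Fraction(1, 4))
```

The fault injected into the RESIDUE partial product shifts its check value by one, so the comparison at step 4 fails and the error flag would be raised.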
3.2.1.2 Radix of the RESIDUE Unit

The bounds on the radix of the RESIDUE Multiplication Unit (r') are
similar to those derived for the RESIDUE Division Unit. For the
corresponding formulas see Section (3.1.1.3).


3.2.1.3 Bounds On Operands

According to (3.44) and (3.45) we have:

$$M \ \ge\ X,\ Y \ \ge\ -M$$

$$M' \ \ge\ X',\ Y' \ \ge\ 0$$

Since the operands of the MAIN Unit are assumed to be redundant,
from Eq. (C.22) in Appendix C we have:

$$M \ \le\ \frac{2k - 1}{4k''} \qquad (3.53)$$


The  case  of  non-redundant  operand  multiplication  has 
not  been  addressed  in  Appendix  C  of  this  thesis.  But,  for 
this  case  with  a  similar  derivation  the  following  equation 
has  been  obtained: 

M*  <_  k*-^-  (3.54) 


Note: After adjusting the operands of the MAIN Unit, if X' and Y'
are still out of bounds, multiples of the check constant (A) can be
added to or subtracted from the digits of X' and Y' without
changing the results.


3.2.1.4  An  Example  of  The  Error-Detection  Process 

The  following  is  a  numerical  example  of  the  error 
detection  process  when  residue-coded  operands  are  used: 
Assume : 

f 

r=10,  p=5,  p  =9 
'  r 1 =4 ,  p 1 =2 
A=3 

From (3.42), (3.46), (3.43) and (3.47) we get:

$x_i, y_i \in \{\bar{9}, \bar{8}, \ldots, \bar{1}, 0, 1, \ldots, 8, 9\}$

$p_j \in \{\bar{5}, \bar{4}, \ldots, \bar{1}, 0, 1, \ldots, 4, 5\}$

$x'_i, y'_i \in \{0, 1, 2\}$

$p'_j \in \{\bar{2}, \bar{1}, 0, 1, 2\}$

Therefore: 

5  '  '  2 

k=J,  k  =1  and  k' =| 

From (3.53) and (3.54) we get:

$M \le 0.028$ and $M' \le 0.167$

Assume  M=0.01  and  M' =0.167.  The  operands  and  their  residues 
are  assumed  to  be: 
(X, X') = .(0,0)(0,0)(9,0)(4,2)(6,0)(8,1)(9,0)

(Y, Y') = .(0,0)(0,0)(?,0)(7,1)(2,1)(9,0)(6,0)

Using  Eq.  (C.21)  in  Appendix  C,  the  following  P-P  plot 
for  the  selection  of  the  product  digits  of  the  MAIN  Unit 
will  be  obtained. 
[Figure (3.6): P-P plot for the selection of the product digits of the MAIN Unit]

Using the Algorithm "MULT" given in Appendix C, Table (3.5) is obtained for the operation of the MAIN Unit.

Table  (3.5)-  Results  Obtained  by  the  MAIN  Unit  (EX.  3. 2. 1.4) 


Also:

$P_8 = P_7 - p_7 = -0.1944364$

$R = 0.0001321$

Therefore:

$X \cdot Y = 0.00013212144444$
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Unit.  Table  (3.6)  shows  the  results  obtained  by  the  RESIDUE 
Unit.

From this table we get:

$P'_8 = P'_7 - p'_7 = (0.\bar{1}\bar{2}11000)_4$

Therefore:

$R' = .0000001$

and

$X' \cdot Y' = 0.0000001\bar{1}\bar{2}11000$

Table  (3.7)  is  obtained  by  using  the  Algorithm  "DETECT 
MULT"  and  summarizes  the  operation  of  the  CHECK  Unit  for 
this  example. 

Since $Z_j = Z'_j$ for j = 1, 2, ..., m, m+1, either all the operations have been correct or an undetectable error has occurred.

Now assume that at step 8 of the RESIDUE MULT Algorithm an error changes the sign of the eighth partial product [$P'_8 = (0.\bar{1}\bar{2}11000)_4$]. The incorrect partial product ($P'^{*}_8$) will be:

$P'^{*}_8 = (0.12\bar{1}\bar{1}000)_4$

Continuing "DETECT MULT" from step 8 we get:


+---+----------+----------+--------+------+
| j | X'_j     | Y'_j     | P'_j   | p'_j |
+---+----------+----------+--------+------+
| 1 | .0       | .0       | .0     | 0    |
| 2 | .0       | .0       | .0     | 0    |
| 3 | .0       | .0       | .0     | 0    |
| 4 | .0002    | .0001    | .0002  | 0    |
| 5 | .00020   | .00011   | .0022  | 0    |
| 6 | .000201  | .000110  | .02211 | 0    |
| 7 | .0002010 | .0001100 | .2211  | 1    |
+---+----------+----------+--------+------+

Table (3.6)- Results Obtained by the RESIDUE Unit


j = 8:

$R_7 = |0 + 1|_A = 1$

$Z_8 = |2 + 1|_A = 0$

$Z'_8 = |1 + 1|_A = 2$

Since $Z_8 \ne Z'_8$ this error will be detected by the CHECK Unit.


If no error is detected by the CHECK Unit, the current product digit ($p_j$) is delivered to the next on-line unit along with its residue modulo A. These two constitute one of the operands of the following on-line unit.

+---+---------+----------+-----+------+------------+
| j | R_{j-1} | R'_{j-1} | Z_j | Z'_j | Z_j = Z'_j |
+---+---------+----------+-----+------+------------+
| 1 | 0       | 0        | 0   | 0    | check      |
| 2 | 0       | 0        | 0   | 0    | check      |
| 3 | 0       | 0        | 0   | 0    | check      |
| 4 | 0       | 0        | 2   | 2    | check      |
| 5 | 2       | 0        | 1   | 1    | check      |
| 6 | 2       | 0        | 0   | 0    | check      |
| 7 | 0       | 0        | 0   | 0    | check      |
| 8 | 1       | 1        | 0   | 0    | check      |
+---+---------+----------+-----+------+------------+

Table (3.7)- Results Obtained by the CHECK Unit
3.2.2 AN Coded Operands

The purpose of this section is to present AN Codes in the process of on-line multiplication. As was mentioned earlier, the operands and the results are shown with m digits in a radix-r redundant number system. They are denoted by X, Y and R for the multiplicand, multiplier and the product respectively. The encoded operands are obtained by just multiplying each digit of the operands by the check modulus (A). The table below shows the correspondence between the two sets of operands and the results. Note that since each digit of the multiplicand and the multiplier is a multiple of A, the product digits will be multiples of $A^2$ and not A. Therefore, at the end of each step each product digit should be divided by A to get the correctly encoded product. If we assume that A is a low-cost modulus, this operation will be trivial [AVI 73].
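The digit-wise encoding described above can be illustrated with a short Python sketch. It is not the thesis algorithm "MULT"; it only shows, for arbitrarily chosen signed digits, that products of AN-encoded digits are multiples of $A^2$ and that dividing by A restores a multiple of A. The helper names `an_encode` and `is_multiple_of` and the sample digits are assumptions made for this illustration.

```python
def an_encode(digits, A):
    """Multiply every signed digit by the check modulus A."""
    return [A * d for d in digits]

def is_multiple_of(digits, modulus):
    """True when every digit is a multiple of `modulus`."""
    return all(d % modulus == 0 for d in digits)

A = 3
x = an_encode([0, 1, -1, 1], A)   # illustrative multiplicand digits
y = an_encode([1, 0, -1, -1], A)  # illustrative multiplier digits

# Every digit-by-digit product is a multiple of A**2 ...
digit_products = [xi * yj for xi in x for yj in y]
# ... and dividing by A yields digits that are again multiples of A.
rescaled = [p // A for p in digit_products]
```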


$X = \sum_{i=1}^{m} x_i r^{-i}$ and $X' = \sum_{i=1}^{m} x'_i r^{-i}$

$Y = \sum_{i=1}^{m} y_i r^{-i}$ and $Y' = \sum_{i=1}^{m} y'_i r^{-i}$

$R = \sum_{i=1}^{m} p_i r^{-i}$ and $R' = \sum_{i=1}^{m} p'_i r^{-i}$

$x_i \in \{-\rho', \ldots, -1, 0, 1, \ldots, \rho'\}$ and $x'_i \in \{-A\rho', \ldots, -A, 0, A, \ldots, A\rho'\}$   (3.55)

$y_i \in \{-\rho', \ldots, -1, 0, 1, \ldots, \rho'\}$ and $y'_i \in \{-A\rho', \ldots, -A, 0, A, \ldots, A\rho'\}$   (3.56)

$p_i \in \{-\rho, \ldots, -1, 0, 1, \ldots, \rho\}$ and $p'_i \in \{-A^2\rho, \ldots, -A^2, 0, A^2, \ldots, A^2\rho\}$   (3.57)

$-M \le X, Y \le M$ and $-AM \le X', Y' \le AM$   (3.58)

and the relations between the corresponding digits of these two sets of operands and the results are:

$x'_i = A x_i$, $y'_i = A y_i$, $p'_i = A^2 p_i$,   $i = 1, \ldots, m$   (3.59)
The algorithm that operates on encoded operands is the algorithm "MULT" shown in Appendix C. The proof of the convergence of the algorithm to the correct value of the product is similar to that shown for the algorithm "MULT". Similar to Eq. (C.10) we get:

$R' = X' \cdot Y' - r^{-m}(P'_m - p'_m)$   (3.60)

By devising a product digit selection procedure SELECT in step 4 of the algorithm "MULT" such that:

$|P'_m - p'_m| \le A^2 k$   (3.61)

$R' = X' \cdot Y'$ can be computed to m digit precision. The least significant half of the product is available as the redundant output of the adder after iteration m+1, i.e.,

$P'_{m+1} = P'_m - p'_m$   (3.62)


3.2.2.1 Selection of The Product Digits

Selection of the correct product digit is of great importance in the design of the AN-coded units. Looking back at Eq. (3.57) we deduce that the correct product digit is always a multiple of the square of the check modulus. Derivation of the bounds on the encoded partial product follows a path similar to that explained in Appendix C. These bounds are:

$A^2(rk - 2Mk') \ge P'_j \ge A^2(-rk + 2Mk')$   (3.62)

The selection equations are also represented by a P'-P' plot. It consists of a family of curves which are linear functions of $P'_{j-1}$, with $p'_j$ as parameter ranging from $-A^2\rho$ to $+A^2\rho$ in steps of $A^2$. The area between maximum $P'_j$ and minimum $P'_j$ will be denoted by the $p'_j = i'$ region. A given value of $P'_{j-1}$ and $P'_j$ will correspond to a point in an i' selection region. The product digit $p'_j$ is, therefore, i' and is used in forming the next partial product. The following equation shows these regions:

$i' + A^2 k - 2Mk'A^2 \ge (P'_j)_{i'} \ge i' - A^2 k + 2Mk'A^2$   (3.63)

When j = m in (3.63):

$A^2 k - 2Mk'A^2 \ge P'_m - p'_m \ge -A^2 k + 2Mk'A^2$

and since $Mk'A^2 \ge 0$ then:

$A^2 k \ge P'_m - p'_m \ge -A^2 k$

or

$|P'_m - p'_m| \le A^2 k$

Therefore the relation (3.61) is satisfied by the above selection equations. This proves that R' is indeed the correct product up to m digits.


3.2.2.2 Bounds on Operands


Allowable values of M are obtained by requiring that the upper bound of the $p'_j = i' - A^2$ selection region be always greater than or equal to the lower bound of the $p'_j = i'$ region, i.e.

$U_{i'-A^2} \ge L_{i'}$

Inserting the values from Eq. (3.63) we get:

$M \le \frac{2k-1}{4k'}$   (3.64)

This is exactly the same bound we obtained in Appendix C (Eq. C.22). Therefore, applying AN-Codes to the operands does not change the allowable range.
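The bound of Eq. (3.64) and the overlap requirement it comes from can be checked numerically. The following Python sketch uses exact rational arithmetic; the function names are hypothetical, and the overlap test simply evaluates the two region bounds of Eq. (3.63) for i' = 0 (i' cancels out of the comparison).

```python
from fractions import Fraction

def max_operand_bound(k, k_prime):
    """Largest M allowed by Eq. (3.64): M <= (2k - 1) / (4 k')."""
    return (2 * k - 1) / (4 * k_prime)

def regions_overlap(M, k, k_prime, A):
    """Check U_{i'-A^2} >= L_{i'} using the bounds of Eq. (3.63) with i' = 0."""
    half_width = A**2 * k - 2 * M * k_prime * A**2
    upper_of_lower_region = -A**2 + half_width   # U for the region i' - A^2
    lower_of_upper_region = -half_width          # L for the region i'
    return upper_of_lower_region >= lower_of_upper_region

# Binary case with k = k' = 1
M_binary = max_operand_bound(Fraction(1), Fraction(1))

# Radix-10 case with k = 5/9 and an operand redundancy factor of 1
M_decimal = max_operand_bound(Fraction(5, 9), Fraction(1))
```

With k = k' = 1 the sketch gives M = 1/4, and with k = 5/9 it gives 1/36, roughly 0.028; note that the check modulus A does not affect the bound, in line with the remark above.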


3.2.2.3 An Example of Multiplication With AN-Coded Operands

Assume:

r = 2, k = k' = 1 and A = 3

Applying Eq. (3.64) we get $M \le \frac{1}{4}$.

Assume Equations (3.55) to (3.58) result in:

$x'_i, y'_i \in \{\bar{3}, 0, 3\}$

$p'_i \in \{\bar{9}, 0, 9\}$

$-\frac{3}{4} \le X', Y' \le \frac{3}{4}$

Inserting the given values into Eq. (3.63) we get:

$i' + \frac{9}{2} \ge P'_j \ge i' - \frac{9}{2}$,   $i' \in \{\bar{9}, 0, 9\}$

Figure (3.8) depicts this set of equations.
As an example assume that the multiplicand and the multiplier are:

X' = .0033033303

Y' = .0030303330

Following the steps of the algorithm "MULT" in Appendix C, Table (3.8) will result.


+----+-------------+-------------+------------+------+
| j  | X'_j        | Y'_j        | P'_j       | p'_j |
+----+-------------+-------------+------------+------+
| 1  | .0          | .0          | .0         | 0    |
| 2  | .0          | .0          | .0         | 0    |
| 3  | .003        | .003        | .009       | 0    |
| 4  | .0033       | .0030       | .009       | 0    |
| 5  | .00330      | .00303      | .0099      | 0    |
| 6  | .003303     | .003030     | .09999     | 0    |
| 7  | .0033033    | .0030303    | .9900099   | 9    |
| 8  | .00330333   | .00303033   | .09009099  | 0    |
| 9  | .003303330  | .003030333  | .90909009  | 9    |
| 10 | .0033033303 | .0030303330 | .900909999 | 9    |
+----+-------------+-------------+------------+------+

Table (3.8)- An Example of AN-Coded Multiplication


From  this  table  we  get: 



$P'_{11} = P'_{10} - p'_{10} = (.0990900090)_2$

and therefore:

$X' \cdot Y' = (.00000090990990900090)_2$

The necessary condition for correctness of the operation is satisfied because all the digits of the product and partial product are multiples of the check modulus (A=3).
3.3 Error-Coded On-line Addition


3.3.1 Residue-Coded Operands

Assume that the summands A and B are represented by m digits in a redundant number representation system (Eqs. D.1 and D.2). The residue encoded summands are (A, A') and (B, B') such that:

(A, A') = .(a_1, a'_1)(a_2, a'_2)...(a_m, a'_m)

(B, B') = .(b_1, b'_1)(b_2, b'_2)...(b_m, b'_m)

The relation between A and A' (B and B') is:

$a'_i = |a_i|_A$, $b'_i = |b_i|_A$   (3.65)

$a'_i, b'_i \in \{0, 1, \ldots, (A-1)\}$   (3.66)

The sum R is shown by m+1 digits also in a radix-r redundant number system (Eq. D.3). The residue encoded sum is:

(R, |R|_A) = (s_0, |s_0|_A).(s_1, |s_1|_A)...(s_m, |s_m|_A)

$s_i$ is assumed to belong to the following set:

$s_i \in \{-\rho, \ldots, \bar{1}, 0, 1, \ldots, \rho\}$,   $(r-1) \ge \rho \ge r/2$   (3.67)

A' and B' in going through the RESIDUE Unit generate a residue sum which is represented by R' such that:

$R' = \sum_{i=0}^{m} s'_i r'^{-i}$

$s'_i$ is assumed to belong to the following digit set:

$s'_i \in \{-\rho', \ldots, \rho'\}$,   $(r'-1) \ge \rho' \ge r'/2$   (3.68)

The algorithm run by the MAIN and RESIDUE Units is the algorithm "ADD" presented in Appendix D. Proof of the convergence of this algorithm to the correct value of the sum is given in the same appendix.

The  Error  Detection  Algorithm 

In  this  section  the  algorithm  which  should  be  run  by 
the  CHECK  Unit  will  be  derived.  CHECK  Unit  starts  the  opera¬ 
tion  after  generation  of  s^  and  s\  by  the  MAIN  and  RESIDUE 
Units,  respectively.  It  examines  the  necessary  condition  for 
fault  free  operation  of  the  MAIN  and  RESIDUE  Units.  Unless 
this  necessary  condition  is  satisfied,  the  CHECK  Unit  stops 
the  operation  and  sets  an  error  flag. 
To derive such an algorithm, from equation (D.9) of Appendix D we get:

$\sum_{i=1}^{j} (a_i + b_i) r^{-i} = \sum_{i=0}^{j-2} s_i r^{-i} + r^{-j+1} P_j$   (3.69)

For the RESIDUE Unit we get a similar expression:

$\sum_{i=1}^{j} (a'_i + b'_i) r'^{-i} = \sum_{i=0}^{j-2} s'_i r'^{-i} + r'^{-j+1} P'_j$   (3.70)

Taking the residues of (3.69) and (3.70) we obtain (3.71) and (3.72).

$\left| \sum_{i=1}^{j} (a'_i + b'_i) r^{-i} \right|_A = \left| \left| \sum_{i=0}^{j-2} s_i r^{-i} \right|_A + r^{-j+1} |P_j|_A \right|_A$   (3.71)

and

$\left| \sum_{i=1}^{j} (a'_i + b'_i) r'^{-i} \right|_A = \left| \left| \sum_{i=0}^{j-2} s'_i r'^{-i} \right|_A + r'^{-j+1} |P'_j|_A \right|_A$   (3.72)

Assuming $|r|_A = |r'|_A$, from (3.71) and (3.72) we get:

$\left| \left| \sum_{i=0}^{j-2} s_i r^{-i} \right|_A + r^{-j+1} |P_j|_A \right|_A = \left| \left| \sum_{i=0}^{j-2} s'_i r'^{-i} \right|_A + r'^{-j+1} |P'_j|_A \right|_A$   (3.73)

This is the relation that the CHECK Unit verifies at the end of each step. To further simplify the detection process we assume $|r|_A = |r'|_A = 1$. As a result, (3.73) reduces to:

$\left| \left| \sum_{i=0}^{j-2} s_i \right|_A + |P_j|_A \right|_A = \left| \left| \sum_{i=0}^{j-2} s'_i \right|_A + |P'_j|_A \right|_A$   (3.74)

Defining the following dummy variables:

$X_{j-2} = \left| \sum_{i=0}^{j-2} s_i \right|_A$

$X'_{j-2} = \left| \sum_{i=0}^{j-2} s'_i \right|_A$

Eq. (3.74) becomes:

$\left| X_{j-2} + |P_j|_A \right|_A = \left| X'_{j-2} + |P'_j|_A \right|_A$   (3.75)

The following algorithm "DETECT ADD" is run by the CHECK Unit and verifies Eq. (3.75).
Algorithm "DETECT ADD"

Step 1 [Initialization]:

    X_{-2} = 0,  X'_{-2} = 0

For j = 1, 2, ..., m+2 Do:

Step 2 [Input Digits]:

    X_{j-2}  = | X_{j-3} + s_{j-2} |_A
    X'_{j-2} = | X'_{j-3} + s'_{j-2} |_A

    |P_j|_A and |P'_j|_A

Step 3 [Check for Error]:

    Z_{j-2}  = | X_{j-2} + |P_j|_A |_A
    Z'_{j-2} = | X'_{j-2} + |P'_j|_A |_A

    If (Z_{j-2} ≠ Z'_{j-2})
    THEN E=1, GOTO ERROR SUB
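A minimal Python sketch of the "DETECT ADD" loop is given below. The interface is an assumption made for illustration: sum digits are passed as lists indexed from $s_0$, the partial-sum residues as lists whose index 0 is unused, digits with negative index are treated as zero, and the ERROR SUB routine is replaced by returning the step number.

```python
def detect_add(s, P_res, s_p, Pp_res, A):
    """Hypothetical sketch of the CHECK-unit loop for on-line addition.

    s[i], s_p[i]        : sum digits s_i and s'_i for i >= 0
    P_res[j], Pp_res[j] : |P_j|_A and |P'_j|_A for j >= 1 (index 0 unused)
    Returns the first step j at which Z_{j-2} != Z'_{j-2}, or None.
    """
    X = 0    # X_{-2}
    Xp = 0   # X'_{-2}
    for j in range(1, len(P_res)):
        i = j - 2
        if i >= 0:                       # s_{-1} is taken as zero
            X = (X + s[i]) % A           # X_{j-2}
            Xp = (Xp + s_p[i]) % A       # X'_{j-2}
        Z = (X + P_res[j]) % A           # Z_{j-2}
        Zp = (Xp + Pp_res[j]) % A        # Z'_{j-2}
        if Z != Zp:
            return j                     # error flag E = 1
    return None
```

Only residues and single digits enter the loop, so each step costs a handful of modulo-A additions regardless of the operand length.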


3.3.1.2 An Example of The Error Detection Process


In  what  follows  a  numerical  example  of  the  residue- 
coded  addition  will  be  given. 

Assume: 

i  i 

r*10,  p~p  «9 
r’-4,  p ' =3 
A*3 

k 

Therefore: 

k»k'»k'  «1 

From Eqs. (3.65), (3.66), (3.67) and (3.68) we have:

$a_i, b_i, s_i \in \{\bar{9}, \bar{8}, \ldots, \bar{1}, 0, 1, \ldots, 8, 9\}$

$a'_i, b'_i \in \{0, 1, 2\}$

$s'_i \in \{\bar{3}, \bar{2}, \bar{1}, 0, 1, 2, 3\}$

The encoded summands (A, A') and (B, B') are assumed to be:

(A, A') = .(9,0)(2,2)(4,1)(7,1)(0,0)(5,2)

(B, B') = .(8,2)(4,1)(3,0)(5,2)(6,0)(7,1)

Following  the  algorithm  "ADD"  in  Appendix  D,  Table 
(3.9)  will  be  obtained.  This  table  summarizes  the  results 
obtained  by  the  MAIN  Addition  Unit  during  various  steps  of 
the  ADD  algorithm.  According  to  this  table: 


Table  (3.9)-  Results  Obtained  by  the  MAIN  Unit  (EX.  3. 3. 1.2) 


$R = R_7 = 1.768272$

Table (3.10) summarizes the results obtained by the RESIDUE Addition Unit. From this table we get:

$R' = R'_7 = (0.3\bar{1}2\bar{1}1\bar{1})_4 = (.231303)_4$

The information needed by the CHECK Unit is: $s_{j-2}$, $s'_{j-2}$, $|P_j|_A$ and $|P'_j|_A$. Table (3.11) is obtained by this unit using the algorithm "DETECT ADD" and summarizes the operation of the CHECK Unit for this example.


This table indicates that the necessary condition for correctness of the operation is satisfied for every step of the ADD algorithm ($Z_{j-2} = Z'_{j-2}$ for j = 1, ..., 8).

+---+------+----------+---------+
| j | P'_j | s'_{j-1} | R'_j    |
+---+------+----------+---------+
| 1 | 0.2  | 0        | .0      |
| 2 | 0.23 | 3        | .3      |
| 3 | -0.3 | -1       | .23     |
| 4 | 1.3  | 2        | .232    |
| 5 | -1.0 | -1       | .2313   |
| 6 | 0.3  | 1        | .23131  |
| 7 | -1.0 | -1       | .231303 |
+---+------+----------+---------+

Table (3.10)- Results Obtained by the RESIDUE Unit

In order to demonstrate the error detection capability of the proposed scheme, assume that due to an error in the multi-input adder of the MAIN Unit, the sign bit of $P_6$ has been inverted, such that:

$P_6 = -2.8 \rightarrow |P_6|_A = 2$

$P^{*}_6 = 2.8 \rightarrow |P^{*}_6|_A = 1$

Table (3.11)- Results Obtained by The CHECK Unit

Following the "DETECT ADD" algorithm from step j = 6 we get:

j = 6:

$X_4 = |1 + 3|_3 = 1$

$X'_4 = |1 + \bar{1}|_3 = 0$

$Z_4 = |1 + 1|_3 = 2$

$Z'_4 = |0 + 0|_3 = 0$

Since $Z_4 \ne Z'_4$, this error will be detected by the CHECK Unit.
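The residues quoted for $P_6$ and its sign-inverted version follow from the assumption $|r|_A = 1$: every positional weight is then congruent to 1 modulo A, so the residue of a signed-digit number is the residue of its digit sum. A short Python sketch (the function name is hypothetical) reproduces the two values:

```python
def residue_of_digits(digits, A):
    """Residue modulo A of a signed-digit number whose radix r satisfies
    |r|_A = 1, so that only the sum of the digits matters."""
    return sum(digits) % A

# P_6 = -2.8 has the signed digits (-2, -8); inverting the sign gives (2, 8)
correct = residue_of_digits([-2, -8], 3)
faulty = residue_of_digits([2, 8], 3)
```

Because the two residues differ, the sign inversion changes $Z_4$ and is caught by the comparison in Step 3.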


3.3.2 AN Coded Operands

In this section the imposition of AN codes on on-line addition will be considered. On-line subtraction can be replaced by addition by just flipping the signs of the subtrahend digits. As before, the encoded summands (A' and B') are obtained by multiplying each of their digits by the check modulus.

$A' = \sum_{i=1}^{m} (A a_i) r^{-i} = \sum_{i=1}^{m} a'_i r^{-i}$

$B' = \sum_{i=1}^{m} (A b_i) r^{-i} = \sum_{i=1}^{m} b'_i r^{-i}$

$R' = \sum_{i=0}^{m} (A s_i) r^{-i} = \sum_{i=0}^{m} s'_i r^{-i}$

$a'_i$, $b'_i$ and $s'_i$ belong to the following digit sets:

$a'_i \in \{-A\rho', \ldots, -A, 0, A, \ldots, A\rho'\}$   (3.76)

$b'_i \in \{-A\rho'', \ldots, -A, 0, A, \ldots, A\rho''\}$   (3.77)

$s'_i \in \{-A\rho, \ldots, -A, 0, A, \ldots, A\rho\}$   (3.78)

It is clear that:

$Ak' \ge A' \ge -Ak'$

$Ak'' \ge B' \ge -Ak''$

$A(k' + k'') \ge R' \ge -A(k' + k'')$
Algorithm "ADD", shown in Appendix D, is imposed on the encoded operands A' and B'. With a similar derivation the following relation will be obtained:

R’^A'+B')  -  r~m  (3.79) 

The  sum  digit  selection  procedure  in  step  4  of  the  al¬ 
gorithm  "ADD"  should  be  such  that  the  following  relation  is 
satisfied : 

|P‘  .-s'  I <Ak '  (3.80) 

m+i  m 

Only  in  that  case  R*  represents  the  sum  of  A'  and  B'  to  m 
digits  of  precision. 

Selection  of  The  Sum  Digits 

Sum  digits  should  be  selected  in  such  a  way  that  they 
are  always  multiples  of  the  check  modulus  (A).  Using  the 
basic  recursion  formula  (Eq.  D.8  in  Appendix  D)  and  follow¬ 
ing  a  similar  procedure  given  in  that  appendix,  the  bounds 
on  the  correct  partial  sum  will  be  obtained. 

a1  .+b‘  .  aVb*i 

rkA  -  TTr-TT*-  Pj  -~rkA  ~  T(r-1-)3  (3-81) 

The  selection  region  i'  ,  is  a  region  in  which  s'  j_^  =  i' 
is  a  correct  sum  digit.  This  area  is  represented  by  the  fol¬ 
lowing  inequality. 


s 


(3.82) 


a'.+b’.  S'.  ,«!’ 

i'+kA  ~  r(r-l)  -  (p‘  3  >i'-kA  - 


a'  .+b*  . 
rfr-1 ) 


In  deriving  this  relation  we  have  followed  a  similar  path 
shown  in  Appendix  D  for  deriving  Eq.  (D.23). 

In order to have overlaps between regions $s'_{j-1} = i'$ and $s'_{j-1} = i' + A$ the following inequality should hold:

$U_{i'} \ge L_{i'+A}$ for all values of i'

The only requirement to satisfy this relation is $k \ge \frac{1}{2}$. Therefore, if the sum (R') is in a redundant number representation system, there are always overlaps between the adjacent regions even if the summands are not redundant.


An  Example  of  Addition  With  AN -Coded  Operands 

Assume:

r = 10, k = k' = k'' = 1, m = 8 and A = 7

From Equations (3.76), (3.77) and (3.78) we get:

$a'_i, b'_i, s'_i \in \{\overline{63}, \overline{56}, \ldots, \bar{7}, 0, 7, \ldots, 56, 63\}$

As  a  numerical  example  assume  the  summands  (A*,  B* )  are: 

A'-. (63) (56) (63) (42) (35) (14) (7) (63) 

B,=.(63)(63)(6T)(56)(42)(63)(l4) (7) 

Applying the "ADD" algorithm on these sets of operands, we obtain Table (3.12). From this table the value of the sum is:

R' = (14).(0)(35)(7)(21)(14)(7)(0)(0)

+---+-----------+----------+
| j | P'_j      | s'_{j-1} |
+---+-----------+----------+
| 1 | (7).(56)  | 14       |
| 2 | (0).(21)  | 0        |
| 3 | (28).(56) | 35       |
| 4 | (0).(42)  | 7        |
| 5 | (14).(63) | 21       |
| 6 | (14).(7)  | 14       |
| 7 | (0).(63)  | 7        |
| 8 | (0).(0)   | 0        |
| 9 | (0).(0)   | 0        |
+---+-----------+----------+

Table (3.12)- An Example of AN-Coded Addition

The necessary condition for the correctness of the operation is satisfied because all the digits of the sum and partial sum are multiples of the check constant (A=7).
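The necessary condition used above, namely that every digit is a multiple of the check constant, is easy to state as code. The sketch below is an illustration only; the function name `an_check` is hypothetical, and the digit magnitudes are those listed for R' in this example.

```python
def an_check(digits, A):
    """Return the positions of digits that are not multiples of A.
    An empty list means the necessary condition is satisfied."""
    return [i for i, d in enumerate(digits) if d % A != 0]

sum_digits = [14, 0, 35, 7, 21, 14, 7, 0, 0]   # digits of R' in the example
violations = an_check(sum_digits, 7)

# A single corrupted digit (35 -> 36) is flagged at its position
corrupted = sum_digits[:]
corrupted[2] = 36
flagged = an_check(corrupted, 7)
```

An error that changes a digit by a multiple of A would pass this test, which is why the condition is necessary but not sufficient.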


CHAPTER  4 


IMPLEMENTATION  CONSIDERATIONS 

In this chapter a hardware organization of the proposed error-coded on-line units will be presented. In Chapter 3 we proposed two methods for detection and/or correction of errors in an on-line unit. These methods were: 1) use of a residue unit along with the residue coded operands, 2) use of AN coded operands along with one (MAIN) unit and the corresponding CHECK unit. In the first case a CHECK unit is also needed to compare the results of the MAIN and RESIDUE processors.

The operation of each of these units has already been explained in Chapter 3 (see the corresponding block diagrams). It is the purpose of the current chapter to consider the hardware realization of each of these units. At the end, using this realization, an estimate of the gate and memory requirements of the error-coded on-line unit will be given. In order to do this, we start with the operation of a residue-coded divide unit. The extension of this work to other basic operations (addition/subtraction and multiplication) is straightforward and will not be considered in this thesis.
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4.1  Design  of  The  Error-Coded  Division  Unit 


As  we  mentioned  in  previous  chapters,  a  residue-coded 
on-line  division  unit  consists  of  the  following  elements: 

1. A Radix r (= 2^k) MAIN Unit

2. A Radix r' (= 2^{k'}) RESIDUE Unit

3.  A  CHECK  Unit 

In  what  follows  the  hardware  design  of  each  of  these 
components  will  be  considered  and  an  estimate  of  the  cost  of 
each  unit  will  be  given. 


4.1.1 Design of The Residue-Coded MAIN Unit

The  design  of  the  MAIN  Unit  when  no  error-detection 
scheme  is  used  has  been  given  in  the  Appendix  A.  In  this 
section  we  modify  this  design  to  make  the  same  unit  suitable 
for  the  case  when  residue  coded  operands  are  used. 


From the algorithm "DETECT DIVIDE" in Chapter 3 it is clear that the residues of the partial remainders $P_j$ and $P'_j$ are needed by the CHECK Unit at every step of the on-line division algorithm. These residues should be obtained by Processing Elements inside the MAIN and the RESIDUE Units in such a way that the modularity of the units is preserved. Having this in mind, the following scheme for determination of $|P_j|_A$ and $|P'_j|_A$ is proposed.


Calculation  of  The  Residues  of  Partial  Remainders 

From the description of the hardware design of the Processing Elements (PE's), we know that partial remainders are represented by m digits. Each PE contains one digit (radix $2^k$) of the interim partial remainder ($w_i^{(j)}$) and the corresponding transfer function ($t_i^{(j)}$) such that:


(4.1) 

and  similarly: 

J  i=l  1  i=ll  1  1  J 

(4.2) 

Therefore: 

1  m  ,  . »  .1 

■Va-| £ '*15V  '*  \\k 

-1  E  I  I.  +  ItP’  l.j,  *  1  r" 1 1 . 1 

|i-l 1  1  A  1  AlA  a(a 

(4.3) 

Assuming $|r|_A = |r'|_A = 1$ we get:

$|P_j|_A = \left| \sum_{i=1}^{m} \left( |w_i^{(j)}|_A + |T_i^{(j)}|_A \right) \right|_A = \left| \sum_{i=1}^{m} R_i^{(j)} \right|_A$   (4.4)

where $R_i^{(j)}$ is defined as:

$R_i^{(j)} = \left|\, |w_i^{(j)}|_A + |T_i^{(j)}|_A \,\right|_A$   (4.5)
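Equations (4.4) and (4.5) say that the residue of the partial remainder can be assembled from per-element residues. A small Python sketch of that decomposition follows; the function names and the integer stand-ins for $w_i^{(j)}$ and $T_i^{(j)}$ are assumptions made purely for illustration and do not model the actual register contents of a Processing Element.

```python
def pe_residue(w, t, A):
    """R_i = | |w_i|_A + |T_i|_A |_A, computed locally in one PE (Eq. 4.5)."""
    return ((w % A) + (t % A)) % A

def partial_remainder_residue(ws, ts, A):
    """|P_j|_A = | sum_i R_i |_A, valid when |r|_A = 1 (Eq. 4.4)."""
    return sum(pe_residue(w, t, A) for w, t in zip(ws, ts)) % A

ws = [5, -3, 7]     # illustrative interim remainder digits
ts = [1, 2, -4]     # illustrative transfer values
via_pes = partial_remainder_residue(ws, ts, 3)
direct = (sum(ws) + sum(ts)) % 3
```

Both routes give the same residue, which is what allows each PE to forward only a few bits to the CHECK Unit.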

Using  Eg.  (A. 9)  in  Appendix-A  we  get: 
The $R_i^{(j)}$'s are computed by the Processing Elements. These residue digits [α bits each] are sent to the CHECK Unit. This unit adds these residues and finds the residue of the result in order to obtain $|P_j|_A$ according to Eq. (4.4).

PE_i obtains $R_i^{(j)}$ by adding the residues of the values in the RW, TA, TP1 and TP2 registers and finding the residue of the result (Eq. 4.6). These values are obtained as described next.

Computation of $|w_i^{(j)}|_A$:

Since $w_i^{(j)}$ is a radix-r Sign and Magnitude digit (k+1 bits), its residue is obtained simply by a two-stage ROM device with the capacity of:

$M_w = \alpha\,[2^k + 2^{\alpha+1}]$ bits   (4.7)

Px(j)  P2 ( j ) 

Computation  of  t^  and  t^  : 

These  two  transfer  functions  have  a  similar  form  and 
k ( k-1 ) 

consist  of  --~2 — "  magnitude  bits  and  one  sign  bit.  They  are 
in  the  following  form: 
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“1 ) 

Computation of $|t_i^{A(j)}|_A$:

$t_i^{A(j)}$ is the transfer function out of the Multi-Input Adder (MIAD) and is represented by (2k+1) redundant binary digits ($\in \{\bar{1}, 0, 1\}$).

In order to obtain the residue of $t_i^{A(j)}$ with respect to A, its (2k+1) digits are grouped into groups of 2 digits each. The residues of all groups are obtained simultaneously, the results are grouped again, and this process continues until the residue of $t_i^{A(j)}$ is obtained. This scheme is shown in Figure (4.2) for k=8 and A=3.

The number of levels required is:

$L = \lceil \log_2 (2k+1) \rceil$   (4.10)

Therefore, the time required for this process ($t_{TA}$) is:

$t_{TA} = L\,t_M = \lceil \log_2 (2k+1) \rceil\, t_M$   (4.11)
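The pairwise grouping scheme can be mimicked in software to confirm the level count of Eq. (4.10). The Python sketch below is an assumption-laden illustration: each "ROM" is modelled as a function that combines the residues of two adjacent groups, the digits are treated as an integer with the most significant digit first, and the names `tree_residue` and `expected_levels` are hypothetical.

```python
import math

def expected_levels(n_digits):
    """Number of pairing levels predicted by L = ceil(log2(n))."""
    return math.ceil(math.log2(n_digits))

def tree_residue(digits, A):
    """Residue mod A of a redundant binary number (digits in {-1, 0, 1},
    most significant first), obtained by repeatedly merging adjacent pairs.
    Returns (residue, number of levels used)."""
    # Each node holds (residue of the group, width of the group in digits)
    nodes = [(d % A, 1) for d in digits]
    levels = 0
    while len(nodes) > 1:
        merged = []
        for i in range(0, len(nodes) - 1, 2):
            (hi, wh), (lo, wl) = nodes[i], nodes[i + 1]
            # The high group is weighted by 2**wl relative to the low group
            merged.append(((hi * pow(2, wl, A) + lo) % A, wh + wl))
        if len(nodes) % 2:            # an unpaired group passes through
            merged.append(nodes[-1])
        nodes = merged
        levels += 1
    return nodes[0][0], levels
```

For k = 8 the transfer function has 17 digits, and the sketch needs five merging levels, matching $\lceil \log_2 17 \rceil = 5$.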

The  memory  required  (M^)  is: 

MTA=k24*2  +a{^24  -t—22ot  +|22<x  + . +22®} 

This  function  can  be  approximated  by: 

MTA»32k  +8atk  +22<X*<x*"~L  (4.12) 

According  to  Eq.  (4.6)  these  residues  should  be  added 
to  obtain  inside  the  i-th  Processing  Element.  The  fol¬ 

lowing  organization  is  proposed  to  perform  this  modular  ad¬ 
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dition 


The  memory  required  for  this  process  (MADD)  is: 

W*  [3*22a>3<x  22a 

Therefore,  the  total  ROM  required  to  obtain  rM 
and  according  to  Equations  (4.7),  (4.8), 

and  (4.13)  is: 

Mr=cx  2k[5-2”at+L]  +ot  C8k+2<x2+2-2ac+1] 

+at  22®  +3 2k  (bits)  for  k>  at+1 

The  time  required  for  this  process  is: 

(2k+l)l)+2 


(4.13) 

from 

(4.12) 

(4.14) 


(4.15) 


4.1.2 Design of The RESIDUE Unit

Since the RESIDUE and the MAIN Units are similar, from Eq. (A.30) in Appendix-A we get:

$G_{PE'} = 64k'^2 + 157k' + 123$   (4.16)

and  also  the  pin  requirements  of  a  Processing  Element  of  the 
RESIDUE  Unit  is: 


Ppg^lSk’+g 


.<j) 


The  amount  of  memory  required  to  compute  R* from 


w'f j)  and  T'!^  similar  to  Eq.  (4.14)  is: 

i  l  ^ 


Mn,=at  2k’[5-2'at+1]  +ac  C8k’+2<x  +2-2<x+1] 

K 


+a  22ct  +32k‘  (bits)  for  k’>  ac+1  (4.18) 


and  also  the  time  is  similar  to  Eq.  (4.15). 


4.1.3 Design of The CHECK Unit

The CHECK Unit receives its inputs from the MAIN and RESIDUE Units. These inputs include:

1. The corresponding digit of the dividend and its residue ($n_i$ and $n'_i$)

2. The corresponding digit of the divisor and its residue ($d_i$ and $d'_i$)

3. Output digits of the MAIN and RESIDUE Units ($q_j$ and $q'_j$)

4. $R_i^{(j)}$ and $R'^{(j)}_i$ from the corresponding PE's of the MAIN and RESIDUE Units.

These inputs are shown in Figure (4.3).


Figure (4.3)- Inputs to The CHECK Unit


The numbers shown on the block diagram belong to the case where r=10, r'=4 and A=3. In this case:

Total Number of Inputs = 4m + 18 (bits)

Inside the CHECK Unit the $R_i^{(j)}$'s are added to generate $|P_j|_A$ and the $R'^{(j)}_i$'s are added to generate $|P'_j|_A$. Therefore, by looking at the algorithm run by the CHECK Unit (Algorithm "DETECT DIVIDE" in Chapter 3) the block diagram of this unit will be obtained as shown in Figure (4.4).

In  what  follows  the  hardware  implementation  of  each  of 
the  components  of  the  CHECK  Unit  will  be  considered. 

Block No. 1

This block adds $n'_i$ [α bits] to $Y_{j-1}$ [α bits] and obtains the residue of the result. Therefore:

time required = $t_M$

ROM needed = $2^{2\alpha} \cdot \alpha$ (bits)

Blocks No. 2 and 4

These two blocks are similar and their hardware requirements are:

$t_2 = t_4 = 2t_M$

$M_2 = M_4 = [2^{k+1} + 2^{2\alpha}]\,\alpha$ (bits)

Block No. 3

This block adds $q'_j$ [(k'+1) bits] to $X'_{j-1}$ [α bits] and obtains the residue of the result.

$M_3 = [2^{k+1} + 2^{2\alpha}] \cdot \alpha$ (bits)

Figure (4.4)- An Implementation of The CHECK Unit

Blocks No. 5 and 6

Block 5 (6) subtracts $|P_j|_A$ ($|P'_j|_A$) from $Y_j$ ($Y'_j$) and obtains the residue of the result. Therefore:

$M_5 = M_6 = 2^{2\alpha} \cdot \alpha$ (bits)

$t_5 = t_6 = t_M$

Blocks No. 7, 8, 9 and 10

Block 7 (8) multiplies two digits of α bits each. Block 9 (10) obtains the residue of the result. Therefore, the combination of blocks 7 and 9 (8 and 10) requires the following amount of hardware:

$M_7 + M_9 = M_8 + M_{10} = 2^{2\alpha} \cdot \alpha$ (bits)

$t_7 + t_9 = t_8 + t_{10} = t_M$


Block No. 11

This is a simple comparator which compares two residues of α bits each. This block can be implemented by a level of exclusive OR's followed by an OR gate. Therefore, we need α XOR's and one large OR gate. Assuming 3 gates per XOR and a two-gate delay for each we find:

$G_{11} = 3\alpha + 1$


Blocks No. 12, 13, 14 and 15

Block 12 (13) adds m operands of $\alpha$ bits each. Block 14 (15) obtains the residue of the result. The combination of these two blocks is realized by levels of ROM devices as shown in Figure (4.2). The number of levels required is:

$L = \lceil \log_2 m \rceil$ (4.19)

Figure (4.5) depicts this organization when m=8 and A=3.

The number of modules required for this process is:

No. of Modules $= 1 + 2 + 2^2 + \cdots + 2^{L-1} = m - 1$

Therefore, the memory required is:

$M_{12} + M_{14} = M_{13} + M_{15} = (m-1)\cdot 2^{2\alpha}\cdot\alpha$ (4.20)
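As a rough illustration of (4.19) and (4.20), the following Python sketch models the ROM organization as a binary tree of two-input modules, each holding the residue of the sum of its two $\alpha$-bit inputs. The function names and sample operands are hypothetical, and the two-input module shape is an assumption inferred from the $2^{2\alpha}\cdot\alpha$ bits per module quoted in the text.

```python
def rom_tree_cost(m: int, alpha: int) -> tuple[int, int, int]:
    """Return (levels, modules, memory_bits) for adding m residues of alpha bits."""
    levels = (m - 1).bit_length()          # ceil(log2 m), Eq. (4.19)
    modules = m - 1                        # 1 + 2 + ... + 2^(L-1) when m is a power of two
    memory_bits = modules * (2 ** (2 * alpha)) * alpha   # Eq. (4.20)
    return levels, modules, memory_bits


def tree_residue_sum(operands: list[int], A: int) -> tuple[int, int]:
    """Reduce the operands pairwise, level by level; each pair models one ROM lookup."""
    level = [x % A for x in operands]
    depth = 0
    while len(level) > 1:
        nxt = [(level[i] + level[i + 1]) % A for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:                 # an odd operand passes through unchanged
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth


if __name__ == "__main__":
    print(rom_tree_cost(8, 2))
    print(tree_residue_sum([1, 2, 0, 1, 2, 2, 1, 0], 3))
```

For m=8 and A=3 (so $\alpha$=2) the sketch reports three levels and seven modules, which agrees with the m-1 count given in the text.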

From  (4.19)  the  time  required  is: 
4.2 Cost of The Residue-Coded Division Unit


We  assume  that  the  number  of  gates  and  the  amount  of 
memory  required  for  a  unit  is  an  indication  of  its  cost. 
Therefore,  in  this  section  the  overall  gate  complexity  of 
the  residue-coded  on-line  division  unit  will  be  considered. 

4.2.1 Cost of The MAIN Unit

The number of gates required for a Processing Element of the MAIN Unit is (see Appendix A, Eq. A.30):

$G_{PE} = 64k^2 + 157k + 123$ (4.22)
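To get a feel for how (4.22) grows with the radix, the short Python sketch below evaluates it for k = 1 to 8, assuming the radix is $r = 2^k$ as the thesis does in its conclusions. The function names are hypothetical.

```python
def gates_per_pe(k: int) -> int:
    """Gate count of one Processing Element, Eq. (4.22)."""
    return 64 * k * k + 157 * k + 123


def gate_table(max_k: int = 8) -> dict[int, int]:
    """Map radix r = 2^k to the gate count of one PE (r = 2^k is an assumption)."""
    return {2 ** k: gates_per_pe(k) for k in range(1, max_k + 1)}


if __name__ == "__main__":
    for radix, gates in gate_table().items():
        print(f"r = {radix:3d}: {gates} gates")
```

The end points, 344 gates for r=2 and 5475 gates for r=256, match the range of roughly 350 to 5500 gates quoted in Chapter 6.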

On  the  other  hand  incorporation  of  error  detection 
schemes  requires  addition  of  extra  hardware  to  each  Process¬ 
ing  Element.  This  extra  hardware  is  in  the  form  of  a  ROM 
module  added  to  each  PE.  The  capacity  of  this  ROM  (MEC_pE) 
is  given  by  Eq.  (4.14).  Therefore  the  hardware  requirements 
of  each  residue-coded  Processing  Element  is: 

$G_{EC-PE} = G_{PE} = 64k^2 + 157k + 123$
'MEc_pE-MR=a  2k[5-2_a+1]  +  at[8k+2at2+2-2ac+1] 

+at  22<x  +32k  (bits) 

Z  (4.23) 

Since  the  MAIN  Unit  is  composed  of  m  PE's,  gate  and 
memory  required  for  this  unit  is: 
$G_{EC-MAIN} = m\,G_{EC-PE}$

$M_{EC-MAIN} = m\,M_{EC-PE}$ (4.24)


4.2.2  Cost  of  The  RESIDUE  Unit 

The only difference between the MAIN and the RESIDUE Units is in their radices of implementation (r and r'). Therefore the total cost of the RESIDUE Unit will be given by an equation similar to (4.24). That is:

$G_{EC-RES} = m(64k'^2 + 157k' + 123)$

MEC-RES=ml<X  2k'[5-2"<Xfl]+a[8k,+2at2+2-2<X+1] 

+at  22<X  +3 2k *  }  (bits) 

2  (4.25) 

4.2.3  Cost  of  The  CHECK  Unit 

Hardware  requirements  of  the  CHECK  Unit  can  be  obtained 
by  adding  the  hardware  needed  for  each  of  its  components. 
Looking  back  to  Section  4.1.3  we  get: 

$M_{CHECK} = \sum_{i=1}^{15} M_i$

$G_{CHECK} = G_{11}$

$t_{CHECK} = \mathrm{MAX}(t_{12}+t_{14},\, t_4) + t_5 + t_7 + t_9 + t_{11}$ (4.26)


Using  the  values  from  Section  4.1.3  into  (4.26)  we  get: 


L2 


Unit When r'=4 ($\rho'$=3) and A=3 ($\alpha$=2)


CHAPTER  5 


PERFORMANCE  EVALUATION 
5.1 Code Performance

The  purpose  of  this  chapter  is  to  analyze  the  effect  of 
imposing  error  codes  on  the  existing  on-line  algorithms. 
The  economic  feasibility  of  arithmetic  error  codes  in  a  com¬ 
puter  system  depends  on  their  cost  and  effectiveness  with 
respect  to  the  set  of  arithmetic  algorithms  and  their  speed 
requirements.  The  choice  of  a  specific  code  from  the  avail¬ 
able  alternatives  further  depends  on  their  relative  cost  and 
effectiveness  values. 

Arithmetic  error  codes  are  of  special  interest  in  the 
design  of  fault-tolerant  computer  systems,  since  they  serve 
to  detect  (and  correct)  errors  in  the  results  produced  by 
arithmetic  processors  as  well  as  the  errors  which  have  been 
caused  by  faulty  transmission  or  storage.  The  same  encoding 
is  applicable  throughout  the  entire  computing  system  to  pro¬ 
vide  concurrent  diagnosis,  i.e.,  error  detection  which  oc¬ 
curs  concurrently  with  the  operation  of  the  computer.  Real 
time  detection  of  transient  and  permanent  faults  is  obtained 
without  a  duplication  of  arithmetic  processors.  This 
chapter  presents  the  result  of  an  investigation  of  the  cost, 


speed,  interconnection  requirements  and  effectiveness  of  ar¬ 
ithmetic  error  codes  in  on-line  networks.  We  focus  our  at¬ 
tention  on  the  residue  and  AN  coded  division  units.  The 
results  obtained  can  be  extended  to  other  on-line  operations 
with  the  corresponding  modifications. 

5.1.1 Hardware and Interconnection Requirements

We  define  the  "perfect  unit"  to  be  a  unit  in  which  log¬ 
ic  faults  do  not  occur.  The  specified  set  of  arithmetic  al¬ 
gorithms  is  carried  out  with  prescribed  speed  and  without 
errors.  For  a  given  algorithm,  word  length,  and  number 
representation  system  of  the  perfect  unit  the  introduction 
of  any  code  will  result  in  changes  that  represent  the  cost 
of  the  code.  The  components  of  the  cost  are  discussed  below 
in  general  terms  applicable  to  all  arithmetic  error  codes 
[AVI  71]. 

1)  Word  length:  The  encoding  introduces  redundant  bits 
in  the  number  representation.  A  proportional  hardware  in¬ 
crease  takes  place  in  storage  arrays,  data  paths,  and  pro¬ 
cessor  units.  The  increase  is  expressed  as  a  percentage  of 
the  perfect  design.  "Complete  duplication"  (100  percent  in¬ 
crease)  is  the  encoding  which  serves  as  the  limiting  case. 
In  residue  encoding,  the  residue  of  each  digit  with  respect 
to  A  is  attached  to  it  and  should  be  carried  along  with  the 
corresponding  digit.  Assuming  that  the  operands  and  the 
results belong to a redundant number system we have:

$x_i \in \{-\rho, \ldots, -1, 0, 1, \ldots, \rho\}$, $(r-1) \ge \rho \ge r/2$

The corresponding operands in the RESIDUE unit belong to the set:

$x'_i \in \{0, 1, 2, \ldots, (A-1)\}$, $A < r'$

The number of bits required to represent $x_i$ is:

$n = \lceil \log_2 2\rho \rceil$

Similarly, the number of bits required to represent $x'_i$ is:

$n' = \lceil \log_2 A \rceil$

Therefore, all the data paths should be increased by the factor of n'/n:

$n'/n = \lceil \log_2 A \rceil / \lceil \log_2 2\rho \rceil \approx \log_{2\rho} A$ (5.1)

Also all the storage requirements of the units will increase by the same factor.


When using AN codes, digits of all the operands belong to the following set:

$x_i \in \{-A\rho, \ldots, -A, 0, A, \ldots, A\rho\}$

Therefore, the total number of bits required is:

$n' = \lceil \log_2 2A\rho \rceil$

and the factor by which the word length increases is:

$\Delta = (n' - n)/n$ (5.2)

For example, if:

r=10, $\rho$=5 and A=7

then:

n=4 and n'=7

This results in:

$\Delta$ = 75%
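The word-length figures can be reproduced with a few lines of Python. This is only a sketch of the bit-count formulas given in this section; the function names are hypothetical, and the ceiling of the base-2 logarithm is computed with integer arithmetic to avoid floating-point rounding.

```python
def bits_needed(values: int) -> int:
    """ceil(log2(values)) for values >= 1, using integers only."""
    return (values - 1).bit_length()


def an_code_overhead(rho: int, A: int) -> tuple[int, int, float]:
    """Return (n, n', (n' - n)/n) for AN-coded digits, Eq. (5.2)."""
    n = bits_needed(2 * rho)
    n_coded = bits_needed(2 * A * rho)
    return n, n_coded, (n_coded - n) / n


def residue_code_overhead(rho: int, A: int) -> tuple[int, int, float]:
    """Return (n, n', n'/n) for residue-coded digits, Eq. (5.1)."""
    n = bits_needed(2 * rho)
    n_residue = bits_needed(A)
    return n, n_residue, n_residue / n


if __name__ == "__main__":
    print(an_code_overhead(5, 7))       # the r=10, rho=5, A=7 example
    print(residue_code_overhead(5, 7))
```

With $\rho$=5 and A=7 the AN-code call gives n=4, n'=7 and an increase of 0.75, in agreement with the 75% figure in the example.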

2) The Checking Algorithm: This tests the code validity of every incoming operand and every result of an instruction. A correcting operation follows when an error-correcting code is used. The cost of the checking algorithm has two interrelated components: the hardware complexity and the time required by checking. The complete duplication case requires only bit by bit comparison; other codes require more hardware and time. Provisions for fault detection in the checking hardware itself are needed and add to the cost.

In the residue scheme, the checking is done by the CHECK unit and consists of comparing the outputs of the RESIDUE and the MAIN units. This operation is performed by the "DETECT DIVIDE" algorithm mentioned in Chapter 3. Therefore, the only extra hardware we require for the checking algorithm is the CHECK unit. A sample block diagram of this unit is shown in Section 4.1.3 of this thesis. By referring to this figure, we note that the hardware required to implement such a unit is not complex at all. Also, in the next section we prove that the checking procedure does not introduce any delay into the operation of the on-line units.

3) The Arithmetic Algorithms: An encoding usually requires a more complex arithmetic operation than the perfect computer. This cost is expressed by the incremental time and hardware required by the new algorithm. As was mentioned earlier in this thesis, the algorithms used by the error-coded units are not different from those used by the ordinary units. Therefore, imposing error codes on on-line units does not add any cost of this type. Also, note that we do not require new algorithms for the residue units. The "RESIDUE OP" algorithms are exactly the same as the "MAIN OP" algorithms, but they are run on the residue operands.

5.1.2 Time Requirements

Introduction of error detection schemes into the operation of an on-line divide unit results in an increase of the basic recursion step time ($T_{STEP}$). This increase in time is due to the following two factors:

1.  Time  required  to  compute  R f ^ 1 s  and  R'j-^'s  in  the  i-th 
on-line  Processing  Element  (tr). 

2.  Time  required  by  the  CHECK  Unit  (Eq.  4.27) 
Time requirements of the MAIN on-line divide unit are considered in Appendix B of this thesis. From Eq. (B.15) in this appendix $T_{STEP}$ is:

TSTEP-*.+V-t.+tre*  1 0*+7  [l°92  <*+l  )|«4  )Sg  (5.3) 


Adding 1 and 2 above to this equation, the basic recursion step time for the residue-coded units ($T_{RC-STEP}$) will be:

$T_{RC-STEP} = t_s + t_{PE} + t_r + t_{CHECK}$ (5.4)

The  process  of  obtaining  rM^'s  can  be  started  as  soon 
as  the  registers  TP1,  TP2,TA  and  RW  are  loaded  with  the 
correct  values.  Having  this  in  mind  the  graph  representation 
of $T_{RC-STEP}$ will be obtained as shown in Figure (5.1).

As this diagram indicates, while the CHECK Unit is examining the results of the j-th step, the MAIN and RESIDUE Units are in the (j+1)-th step. This is possible because for all values of the radix (r) the following inequality is satisfied:

$t_s + t_{PE} > t_r + t_{CHECK}$ for all r (5.5)

This means that the results of the j-th step can be checked by the CHECK Unit while the (j+1)-th step is in progress. Therefore, there is no time penalty involved in introducing the check procedure. That is:


Figure 5.1 - Graph Representation of $T_{RC-STEP}$
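The overlap argument around inequality (5.5) can be sketched in Python as a two-stage pipeline whose step time is set by the slower stage. Modelling the step time as the maximum of the two stage times is an assumption made here for illustration, and the gate-delay values in the example are invented, not taken from the thesis.

```python
def rc_step_time(t_s: int, t_pe: int, t_r: int, t_check: int) -> tuple[int, bool]:
    """Return (effective step time, True if checking is fully hidden).

    The compute stage takes t_s + t_pe; the check stage takes t_r + t_check
    and runs in parallel with the next compute step.
    """
    compute = t_s + t_pe
    check = t_r + t_check
    hidden = compute > check            # inequality (5.5)
    return max(compute, check), hidden


if __name__ == "__main__":
    # Hypothetical gate-delay figures, chosen only to exercise both outcomes.
    print(rc_step_time(10, 39, 4, 20))
    print(rc_step_time(5, 10, 10, 10))
```

When (5.5) holds, the effective step time equals the unprotected step time, which is the sense in which the check adds no time penalty.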


2.  decreases  as  the  radix  of  the  MAIN  Unit  (r)  increases. 


This implies that it is beneficial to use higher radices for the MAIN Unit. Also it is clear that in order for the design to be economically feasible, the radix of the MAIN Unit should be greater than the radix of the RESIDUE Unit (r').
5.2 Code Effectiveness


An  arithmetic  error  occurs  when  a  logic  fault  causes 
the  change  of  one  or  more  digits  in  the  result  of  an  algo¬ 
rithm.  A  logic  fault  is  defined  to  be  the  deviation  of  one 
or  more  logic  variables  from  the  values  specified  in  the 
perfect  design.  Logic  faults  differ  in  their  duration,  ex¬ 
tent,  and  nature  of  the  deviation  from  perfect  values.  The 
effectiveness  of  an  arithmetic  error  code  in  a  computer  may 
be  expressed  in  two  forms:  as  a  direct  value  effectiveness, 
and  as  a  design-dependent  fault  effectiveness  [AVI  71]. 

1)  Value  Effectiveness:  The  most  direct  measure  of  ef¬ 
fectiveness  is  the  listing  of  the  error  values  that  will  be 
detected  or  corrected  when  the  code  is  used.  These  values 
are  determined  by  the  properties  of  the  code  and  are  in¬ 
dependent  of  the  logic  structure  of  the  computer  in  which 
the  code  will  be  used.  Value  effectiveness  for  100  percent 
detection  (or  correction)  of  some  class  of  error  values  has 
been the main measure of arithmetic codes. For example, single error detection (or correction) is said to occur when all (100 percent) errors of value

$\pm c\,r^i$, $0 < c < r$, $0 \le i \le m-1$

are detected (or corrected) in an m-digit, radix-r number. There is no direct reference to algorithms or their implementation. Codes with value effectiveness of less than 100 percent detection are useful when their cost is low and when other means of fault tolerance supplement the codes in a computer.

2)  Fault  Effectiveness:  The  purpose  of  arithmetic  error 
codes  in  digital  systems  is  to  detect  the  occurrence  of  log¬ 
ic  faults.  The  detection  enables  the  system  to  initiate 
corrective  action  (error  correction,  diagnosis,  program  res¬ 
tart,  etc.).  In  order  to  assess  the  effectiveness  of  fault 
detection,  the  value  effectiveness  of  a  code  must  be 
translated  into  a  measure  of  fault  effectiveness  for  one  or 
more  specified  types  of  logic  faults.  The  translation  is 
performed  separately  for  every  algorithm  and  requires  an  er¬ 
ror  table  for  every  type  of  fault.  The  error  table  is  gen¬ 
erated  from  the  description  of  the  logic  implementation  of 
the  algorithms.  The  specified  fault  is  applied  to  every 
logic  circuit  which  is  used  by  the  algorithm.  Every  applica¬ 
tion  yields  an  error  value  (or  a  set  of  error  values)  by 
which  the  fault  will  change  the  perfect  value  of  the  result 
to  the  actual  (incorrect)  value.  The  error  table  lists  all 
error  values  together  with  their  relative  frequencies  of  oc¬ 
currence  during  the  compilation  of  the  error  table.  A  com¬ 
parison  of  the  error  table  with  the  detectable  error  values 
of  the  given  code  shows  which  entries  of  the  error  table  are 
not  detectable.  Therefore,  the  fault  effectiveness  of  a  code 
with  respect  to  the  given  algorithm  and  the  specified  fault 
is  the  percentage  of  all  occurrences  of  this  fault  which 
will be detected (or corrected) when the given code is employed. Less than 100 percent fault-effective codes are of interest when their cost is low, because other methods of fault tolerance can be used to reinforce the code. If the fault-effectiveness for an algorithm and a given fault is not sufficient, it may be improved by redesigning the implementation of the algorithm to eliminate some or all of the undetectable entries of the error table [AVI 71].
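The translation from an error table to a fault-effectiveness figure can be sketched as follows for a residue-style check, where an error value is taken to be undetectable when it is divisible by the check constant A. The error table in the example is invented purely for illustration; a real table would be generated from the logic description of the unit, as described above.

```python
def fault_effectiveness(error_table: dict[int, int], A: int) -> float:
    """Percentage of fault occurrences whose error value is detectable.

    error_table maps an error value to its number of occurrences.
    An error value is treated as detectable when it is not divisible by A.
    """
    total = sum(error_table.values())
    detected = sum(count for value, count in error_table.items() if value % A != 0)
    return 100.0 * detected / total


if __name__ == "__main__":
    # Hypothetical error table: value -> relative frequency of occurrence.
    table = {1: 40, -1: 30, 3: 20, 6: 10}
    print(fault_effectiveness(table, 3))
```

In this made-up table the entries 3 and 6 are multiples of A=3, so 30 of the 100 occurrences escape detection.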

5.2.1 Error Detection Analysis for Residue Encoding

In  this  and  the  following  sections  we  focus  our  atten¬ 
tion  on  the  error-coded  on-line  divide  unit.  Error  detection 
capabilities  of  this  unit  when  residue  encoding  is  used  will 
be  considered  in  this  section  and  the  following  section  ad¬ 
dresses  the  same  problem  when  AN  codes  are  used. 

By  referring  to  the  block  diagrams  of  the  MAIN  and  the 
RESIDUE  units  shown  in  Chapter  4,  the  logic  faults  that  can 
happen  in  the  error-coded  units  can  be  divided  into  two 
parts: 

1) Logic faults that occur in the SELECTION blocks of the MAIN and the RESIDUE units. These faults result in the selection of incorrect quotient digits ($q_j$ and $q'_j$).

2)  Faults  in  the  other  parts  of  the  units  including  faults 
in  Multi-input  Redundant  Adders,  Operand  Registers  and  the 
digit  transfer  operation. 
The proposed residue scheme cannot detect the first category of errors, i.e., errors in the SELECTION Units. The reason is that errors in $q_j$ and $q'_j$ are compensated by the resulting errors in the corresponding partial remainders $P_j$ and $P'_j$. This is due to step 4 of the "MAIN DIVIDE" and "RESIDUE DIVIDE" algorithms. But this type of error can easily be detected by the range test of the corresponding partial remainders $P_j$ and $P'_j$. The following theorem proves this claim.

Theorem  (1):  Any  deviations  of  the  selected  quotient  digits 
from  the  correct  value  will  result  in  a  partial  remainder 
which  is  out  of  bounds. 

Proof 


The maximum and minimum values of the j-th partial remainder ($P_j$) with non-redundant operands have been derived in [GOR 80] and are:

$kD_j - r^{-\delta} \ge P_j \ge -kD_j + k\,r^{-\delta}$ (5.9)

Similarly, $P'_j$ is bounded by:

$k'D'_j - r'^{-\delta} \ge P'_j \ge -k'D'_j + k'\,r'^{-\delta}$ (5.10)

An error in either SELECTION unit may increase (or decrease) the value of the j-th quotient digit $q_j = i$ ($q'_j = i'$) by the amount of E (E'). Figure (5.2) shows an $rP_j$-$D_j$ plot and the corresponding $q_j = i$ selection regions of the MAIN Unit.


Assume that the (j-1)-th partial remainder is in the following range (shaded area):

$L_{i+1} \ge rP_{j-1} \ge U_{i-1}$ (5.11)

It is clear that the only acceptable value of the j-th quotient digit is:

$q_j = \mathrm{SELECT}(rP_{j-1}, D_{j-1}) = i$ (5.12)

Now assume that due to an error in the SELECTION Unit, the actual (incorrect) value of $q_j$ is:

$q_j = i + E$ (5.13)


The following two equations have been derived in [GOR 80]:

$U_i = (i+k)D_{j-1} - r^{-\delta+1}$
$L_i = (i-k)D_{j-1} + k\,r^{-\delta+1}$ (5.14)

Inserting (5.14) into (5.11) we get:

$(i+1-k)D_{j-1} + k\,r^{-\delta+1} \ge rP_{j-1} \ge (k+i-1)D_{j-1} - r^{-\delta+1}$ (5.15)

Using these maximum and minimum values of $rP_{j-1}$ in the basic recursion step (Eq. 3.12), the bounds on $P_j$ are obtained:

$\mathrm{MAX}(P_j) = (i+1-k)D_{j-1} + k\,r^{-\delta+1} - (i+E)D_j + (r-1)\,r^{-\delta}$ (5.16)

$\mathrm{MIN}(P_j) = (k+i-1)D_{j-1} - r^{-\delta+1} - (i+E)D_j - k(r-1)\,r^{-\delta}$ (5.17)


Assuming  (5.16)  and  (5.17)  reduce  to: 


D  D 

The  allowable  values  of  8  are  [GOR  80]: 
-8+1.  2k-l 


l+r-S+l.r-S 

(5.18) 

-kr~®+1+kr~® 

(5.19) 

4). 


*  -  k+1  ] 

Inserting  (5.20)  in  (5.18)  and  (5.19)  we  get: 


(5.20) 


$\mathrm{MAX}(P_j) = (k-E)D_j - r^{-\delta}$ (5.21)

$\mathrm{MIN}(P_j) = -(k+E)D_j + k\,r^{-\delta}$ (5.22)

Clearly, when E=0 these bounds should be equal to those obtained previously (see Eq. 5.9). But, when E≠0, it is easy to prove that the resulting $P_j$ is out of bounds.
Case 1) E is Positive:

Assume:

$\mathrm{MAX}(P_j) = (k-E)D_j - r^{-\delta} \le L$ (5.23)

Inserting the value of L (the lower bound on $P_j$) from (5.9) in (5.23) we get:

$(2k-E)D_j \le (1+k)\,r^{-\delta}$

Replacing $\delta$ from (5.20) we get:


E  >2e+± 

—  r 

p  is  defined  to  be  in  the  following  range: 


(5.24) 


r“1i  P 

Using  this  relation  in  (5.24)  results  in: 


E>l+i 


(5.25) 


This means that if E ≥ 2 then $P_j$ will be out of the correct bounds. Note that the given derivation was for the worst case, and usually the smallest possible value of E (E=1) will generate a partial remainder which is well out of bounds. This can be an indication of an error.


Case  2)  E  is  Negative: 

Similar  analysis  follows  in  this  case  and  the  result 

is: 
Therefore, in order to detect errors in the SELECTION units, we can compare the magnitude of the resulting partial remainders with the maximum and the minimum allowable values. If this value is within the range then no error has occurred. But, if it is out of bounds and the CHECK unit does not indicate an error, then the SELECTION unit of the corresponding module is malfunctioning.
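The range test can be sketched in Python using the bounds of Eq. (5.9). Exact fractions are used so that the comparison is not disturbed by rounding. The function names and all numeric values in the example (including the value chosen for the constant k) are hypothetical and serve only to exercise the test.

```python
from fractions import Fraction


def remainder_bounds(D: Fraction, k: Fraction, r: int, delta: int) -> tuple[Fraction, Fraction]:
    """Lower and upper bounds on the partial remainder, following Eq. (5.9)."""
    unit = Fraction(1, r ** delta)          # r^(-delta)
    upper = k * D - unit
    lower = -k * D + k * unit
    return lower, upper


def selection_fault_suspected(P: Fraction, D: Fraction, k: Fraction,
                              r: int, delta: int, check_error: bool) -> bool:
    """True when P is out of bounds while the CHECK unit reports no error."""
    lower, upper = remainder_bounds(D, k, r, delta)
    out_of_range = not (lower <= P <= upper)
    return out_of_range and not check_error


if __name__ == "__main__":
    D, k = Fraction(1, 2), Fraction(5, 9)   # illustrative values only
    print(selection_fault_suspected(Fraction(1, 4), D, k, 10, 3, check_error=False))
    print(selection_fault_suspected(Fraction(3, 4), D, k, 10, 3, check_error=False))
```

The second call flags a suspected SELECTION fault because the remainder lies outside the bounds while the CHECK unit is silent.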


The second type of faults that we are concerned about are those that do not affect the selection function. They may occur in other parts of the units, including the registers which hold the operands, multi-input redundant adders, carry generation blocks and partial remainder registers. These errors are detectable by the proposed residue scheme as long as the value of the error is not divisible by the check constant A. Referring to algorithm "DETECT DIVIDE", this is the case when $Z_j \ne Z'_j$, for j = 1, 2, ..., m.

Since the MAIN and the RESIDUE units are totally separate, compensation of errors does not happen. But errors will remain undetectable if they occur in only one unit and $|\epsilon_1|_A = 0$, or in two units and $|\epsilon_1 - \epsilon_2|_A = 0$.


As an example assume an error in the multi-input redundant adder of the MAIN Unit changes the perfect value of partial remainder $P_j$ to an actual (incorrect) value $P^*_j$. Therefore, referring to the algorithm "DETECT DIVIDE" we have:

$Z_j = |X'_j \cdot (Y_j - |P_j|_A)|_A$ (5.27)

$Z'_j = |X'_j \cdot (Y'_j - |P'_j|_A)|_A$ (5.28)

$Z^*_j = |X'_j \cdot (Y_j - |P^*_j|_A)|_A$ (5.29)

Subtracting (5.27) from (5.29) and taking the residues of both sides with respect to A we get:

$|Z^*_j - Z'_j|_A = |Z^*_j - Z_j|_A = |X'_j \cdot (|P_j - P^*_j|_A)|_A$ (5.30)

The difference between $P_j$ and $P^*_j$ is the error ($\epsilon$). Inserting this in (5.30) we get:

$|Z^*_j - Z'_j|_A = |X'_j \cdot |\epsilon|_A|_A = ||X'_j|_A \cdot |\epsilon|_A|_A$ (5.31)

Therefore, the error will go undetected if and only if:

$|X'_j|_A = 0$

or

$|\epsilon|_A = 0$

because, when $Z^*_j = Z'_j$, step 3 of the algorithm "DETECT DIVIDE" cannot catch the error.


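For a single error we have:

$\epsilon = \pm C\,r^{-i}$, $\rho \ge C \ge 1$ (MAIN Unit)

$\epsilon' = \pm C'\,r'^{-i}$, $\rho' \ge C' \ge 1$ (RESIDUE Unit)

Assuming $|r|_A = |r'|_A = 1$ we get:

$|\epsilon|_A = |\pm C|_A$ and $|\epsilon'|_A = |\pm C'|_A$

Therefore, single digit errors will go undetected if:

$|C|_A = 0$ (5.32)

$|C'|_A = 0$ (5.33)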

But  for  the  RESIDUE  Unit  the  following  relation  is  satisfied 
(Section  3.1.3)  . 


$C' \le \rho' \le r' - 1$ (5.34)

From (5.34) we deduce that:

$C' < A$

Therefore, (5.33) can never be satisfied unless C'=0. This proves that all single digit errors in the RESIDUE Unit will be detected by the proposed scheme. Similar errors in the MAIN Unit may, in some cases, go undetected.

Assuming that each radix r (r') digit is represented by $\lceil \log_2 r \rceil$ ($\lceil \log_2 r' \rceil$) bits inside the machine, all single bit errors are detectable as long as A is a low-cost modulus. An example of the error detection capabilities of the residue encoding is shown in Section (3.1.4).
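Conditions (5.32) and (5.33) can be explored by enumerating the possible single-digit error magnitudes. The Python sketch below lists the magnitudes C that a residue check modulo A would miss; the function name and the sample parameters are hypothetical, and the sketch relies on the same assumption as the text, namely that the radix leaves residue 1 modulo A.

```python
def undetected_digit_errors(rho: int, A: int) -> list[int]:
    """Single-digit error magnitudes C in 1..rho with |C|_A = 0."""
    return [c for c in range(1, rho + 1) if c % A == 0]


if __name__ == "__main__":
    # MAIN Unit with r=10, rho=5, A=3 (10 mod 3 = 1): one magnitude escapes.
    print(undetected_digit_errors(5, 3))
    # A residue unit whose largest digit is below A: nothing escapes.
    print(undetected_digit_errors(2, 3))
```

The first call shows why some MAIN Unit errors may go undetected, while the second reflects the case $C' < A$ in which every single-digit error is caught.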


5.2.2 Error Detection/Correction Analysis for AN Encoding
When AN codes are used in an on-line unit, the checking procedure is simply finding the residue of each single digit of the results, as they are generated, with respect to the check modulus A. If the operation of the AN coded unit is fault-free, then all these residues should be equal to zero. A non-zero residue is an indication of an error. In Chapter 3 of this thesis we depicted the block diagram of an AN coded on-line divide unit. In this unit the check is performed on the quotient digit ($q'_j$) and the corresponding partial remainder ($P'_j$). Denote the number of bits required to represent $n'_j$, $d'_j$ and $q'_j$ by $\alpha'$, $\beta'$ and $\gamma'$ respectively. Looking back into equations (3.30), (3.31) and (3.32) we obtain:

$\alpha' = \lceil \log_2 2A^2\rho'' \rceil$ (5.35)

$\beta' = \lceil \log_2 2A\rho' \rceil$ (5.36)

$\gamma' = \lceil \log_2 2A\rho \rceil$ (5.37)

For instance, $q'_j$ is represented by ($\gamma'$-1) bits for the magnitude and one bit for the sign. If the 2's complement number system is used, $q'_j$ can be represented by:

$q'_j = (x_{\gamma'-1}, x_{\gamma'-2}, \ldots, x_1, x_0)_2 = -2^{\gamma'-1} x_{\gamma'-1} + \sum_{k=0}^{\gamma'-2} x_k 2^k$ (5.38)
Error Detection

Single error detection requires that the minimum distance between coded numbers be 2, that is, no two coded numbers be a distance 1 apart. Thus for all permissible $q'_i$ and $q'_j$ (or $n'_i$ and $n'_j$, or $d'_i$ and $d'_j$):

$q'_i - q'_j = A(q_i - q_j) \ne 2^k$

This can be assured by choosing A to be odd. The choice A=3 will detect any single error in the binary representation of the operands and the results. Notice that this detection of single errors does not depend on $\alpha'$, $\beta'$ and $\gamma'$ and therefore does not depend on the radix of implementation (r). This means no matter how large $\alpha'$, $\beta'$ and $\gamma'$ are, only two additional bits are sufficient for detection of a single error.
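The single-error detection claim can be checked exhaustively for a small case. The Python sketch below encodes every digit q in the range from -$\rho$ to $\rho$ as the 2's complement word of A·q, flips each bit in turn, and tests whether the corrupted value still looks like a multiple of A. The function names, the word width and the sample parameters are assumptions chosen for illustration.

```python
def to_signed(word: int, width: int) -> int:
    """Interpret a width-bit word as a 2's complement integer."""
    return word - (1 << width) if word & (1 << (width - 1)) else word


def all_single_bit_errors_detected(A: int, rho: int, width: int) -> bool:
    """True if every single-bit flip of every codeword A*q gives a non-zero residue."""
    mask = (1 << width) - 1
    for q in range(-rho, rho + 1):
        word = (A * q) & mask
        for bit in range(width):
            corrupted = to_signed(word ^ (1 << bit), width)
            if corrupted % A == 0:          # residue zero: the error is missed
                return False
    return True


if __name__ == "__main__":
    print(all_single_bit_errors_detected(3, 5, 5))   # odd check modulus A=3
    print(all_single_bit_errors_detected(2, 5, 5))   # even modulus, for contrast
```

A single-bit flip changes the word's value by plus or minus a power of two, which is never a multiple of 3, so the first call succeeds; with A=2 a flip of bit 1 goes unnoticed.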

Error  Correction 

Error  correction  can  also  be  done  if  the  distance 
between  coded  numbers  is  greater  than  2.  For  single  error 
correction  d=3  is  sufficient.  The  following  theorem  speci¬ 
fies  the  range  of  the  numbers  in  which  a  single  error  can  be 
corrected  using  the  check  modulus  A  [PET  72]. 

Theorem (2): For any choice of A, if N is restricted to the range

$-\frac{1}{2}M_2(A,d) < N < \frac{1}{2}M_2(A,d)$ (5.39)

then the AN code has minimum distance of at least d.

In the case of division $+\rho \ge q_j \ge -\rho$, and if d=3 then using the above theorem:

$\rho < \frac{1}{2}M_2(A,3)$ (5.40)

As an example, when r=10 and $\rho$=5 then A=19. Therefore, if we multiply every digit of the dividend (N) and the divisor (D) by A=19 before sending them to the on-line divide unit, then a single error in each of the quotient digits ($q_j$) can be corrected. An example of error detection with AN-coded operands is shown in Section (3.1.2.3).
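The choice A=19 for $\rho$=5 can be reproduced numerically. The Python sketch below assumes that $M_2(A,d)$ denotes the smallest positive integer M for which the arithmetic weight of A·M is less than d, which is the usual reading of [PET 72] but is not spelled out in this section. The arithmetic weight is computed as the number of non-zero digits in the non-adjacent form. All function names are hypothetical.

```python
def arithmetic_weight(n: int) -> int:
    """Number of non-zero digits in the non-adjacent (signed binary) form of n."""
    n = abs(n)
    weight = 0
    while n:
        if n & 1:
            n -= 2 - (n & 3)        # digit +1 if n % 4 == 1, digit -1 if n % 4 == 3
            weight += 1
        n >>= 1
    return weight


def m2(A: int, d: int) -> int:
    """Smallest M >= 1 with arithmetic_weight(A*M) < d (intended for d >= 3)."""
    M = 1
    while arithmetic_weight(A * M) >= d:
        M += 1
    return M


def smallest_check_modulus(rho: int, d: int = 3) -> int:
    """Smallest odd A >= 3 satisfying rho < M2(A, d) / 2, as in Eq. (5.40)."""
    A = 3
    while 2 * rho >= m2(A, d):
        A += 2
    return A


if __name__ == "__main__":
    print(m2(19, 3))
    print(smallest_check_modulus(5))
```

Under that assumed definition the search stops at A=19, since 19·27 = 513 = 2^9 + 1 is the first multiple of 19 with arithmetic weight below 3.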

An  Example  of  Error  Correction 

Assume r=10, $\rho = \rho' = \rho'' = 5$ and A=19 for single error correction. From (3.30), (3.31) and (3.32) we get:

$n'_j \in \{-1805, \ldots, -361, 0, 361, \ldots, 1805\}$

$d'_j, q'_j \in \{-95, -76, \ldots, -19, 0, 19, \ldots, 76, 95\}$

and

$\gamma' = \lceil \log_2 2A\rho \rceil = 8$ bits

Therefore, $q'_j$ is represented by 8 bits inside the machine:

$q'_j = x_7 x_6 \ldots x_1 x_0$

A single error can be in any of these 8 positions and Table (5.2) shows that all these single errors are correctable once the residue of $q'^*_j$ (the incorrect value of $q'_j$) with respect to A is obtained by the CHECK Unit.


+---+--------------+--------------------+
| i | bit in error | residue |q'*_j|_A  |
+---+--------------+--------------------+
| 0 | x_0          | 1, 18              |
| 1 | x_1          | 2, 17              |
| 2 | x_2          | 4, 15              |
| 3 | x_3          | 8, 11              |
| 4 | x_4          | 16, 3              |
| 5 | x_5          | 13, 6              |
| 6 | x_6          | 7, 12              |
| 7 | x_7          | 14, 5              |
+---+--------------+--------------------+

Table (5.2) - Single Error Correction


Since  the  residues  shown  in  Table  (5.2)  are  unique,  a 
single  error  in  any  of  the  8  positions  can  be  corrected 
without  ambiguity.  The  following  is  a  block  diagram  of  the 
CHECK  Unit  for  this  example. 
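The correction step of this example can be sketched in Python as a syndrome lookup. The table below is built from the residues of plus and minus each power of two modulo 19, which reproduces the entries of Table (5.2). The 8-bit 2's complement word format follows the example; the function names are hypothetical and the sketch only handles single-bit errors.

```python
A = 19
WIDTH = 8


def to_signed(word: int) -> int:
    """Interpret an 8-bit word as a 2's complement integer."""
    return word - (1 << WIDTH) if word & (1 << (WIDTH - 1)) else word


# Syndrome (residue mod 19) -> signed error value, one entry per bit and direction.
SYNDROMES = {(sign * (1 << i)) % A: sign * (1 << i)
             for i in range(WIDTH) for sign in (1, -1)}


def correct(word: int) -> int:
    """Return the corrected digit value, assuming at most one bit is in error."""
    value = to_signed(word)
    syndrome = value % A
    if syndrome == 0:
        return value                     # residue zero: accepted as error-free
    return value - SYNDROMES[syndrome]   # remove the error identified by the residue


if __name__ == "__main__":
    good = (A * 5) & 0xFF                # codeword for q = 5, i.e. 95
    bad = good ^ (1 << 7)                # single error in bit x_7
    print(to_signed(bad), "->", correct(bad))
```

Because the sixteen residues are distinct, each non-zero syndrome identifies exactly one bit position and direction; residues that do not appear in the table (9 and 10) would indicate something other than a single-bit error.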
CHAPTER 6


CONCLUSIONS  AND  SUGGESTIONS  FOR  FUTURE  RESEARCH 

In  this  thesis  we  have  presented  a  method  for  detection 
and  correction  of  errors  in  on-line  arithmetic  algorithms. 
This  method  is  based  on  low-cost  arithmetic  error  codes  and 
encodes  each  digit  of  the  operands  separately.  The  encoded 
operands pass through the arithmetic unit digit by digit,
most  significant  digit  first.  The  proposed  algorithms  are 
such  that  they  preserve  the  codes,  therefore  each  digit  of 
the  result  must  conform  with  its  code.  In  this  way,  and 
depending  on  the  code  used,  errors  in  each  single  digit  of 
the  result  can  be  detected  or  corrected.  The  need  for  such  a 
detection/correction  scheme  arises  from  the  fact  that  on¬ 
line  arithmetic  requires  relatively  long  sequences  of  opera¬ 
tions  in  order  to  achieve  speed-up  over  conventional  arith¬ 
metic.  Therefore,  it  is  important  to  protect  them  against 
hardware failures. If not protected, hardware failures could quickly contaminate a large number of results in progress due to the tight coupling of the steps at the digit level. By detecting errors as they occur, an effective, gracefully degradable organization could be achieved. That is, an error at the j-th step would lead to a restriction of precision (significance) of the remaining steps but not to catastrophic termination.


We presented two methods for such a detection/correction scheme: 1) Residue encoding; 2) AN encoding. In the first method, the residue of every digit of the operands with respect to a constant is attached to it and is sent to the on-line unit. Two separate processors process the operands and their residues. The result generated can be checked for having the correct residue with respect to the same constant. In this way, we proved that a single error in each digit of the operands and the corresponding results can be detected. Also, we proved that an error in the selection of the result digits can be detected even without using the proposed residue scheme. It is interesting to note that no new algorithms are necessary for the residue-coded operands. The algorithms are the same as those developed for ordinary operands. The only new algorithm that is needed is the one used by the detection process.

In the second approach, each digit of the operands is multiplied by a constant before entering the on-line network. This code is preserved throughout the network and each digit of the final result can be checked for divisibility by the same constant. We showed that, depending on the check modulus, a single error in each digit of the result can be detected or corrected.
The error-coded algorithms and a block diagram implementation of the corresponding units have been presented in this thesis. A detailed design of a digit-sliced on-line division unit was also considered. This unit was designed as a set of basic Processing Elements (PE), each of which operates on a single digit of the operands and the results. Assuming that the radix of implementation is r ($=2^k$), the number of gates required for one PE has been proved to be proportional to $k^2$. Also, the number of pins required is proportional to k. In short, we showed that the number of gates varies from 350 to 5500 for radices 2 to 256.

Processing  time  of  a  PE  is  also  an  important  factor  and 
was  determined  to  be  in  the  range  of  39  to  116  gate  delays 
for  the  aforementioned  range  of  radices.  Finally,  we  extend¬ 
ed  this  on-line  unit  to  encompass  the  residue-coded 
operands. We proved that the imposition of residue codes on the on-line division unit increases the gate requirements by no more than 39%. The checking procedure can be overlapped with the operation of the MAIN Unit and in that sense there is no time penalty for introducing error codes into the on-line division unit.

There remain several areas of interest that need further research, such as the extension of this work to other functions, for example logarithmic, trigonometric, and exponential. It is apparent that the E-method [ERC 75] is a good candidate for such an extension. It is believed that the detection/correction procedure outlined in this thesis can be applied directly in this and similar cases. Another point of interest is the actual implementation of the on-line processing units in VLSI. Also the simulation of the proposed detection/correction schemes and experimental validation of the code effectiveness warrant further research.

APPENDIX A


HARDWARE  DESIGN  OF  AN  ON-LINE  DIVISION  UNIT 

In this appendix we consider the hardware implementation of the
MAIN DIVIDE Unit. Since the MAIN and RESIDUE Units have similar
organizations, the same design can be applied to a RESIDUE Unit
with minor modifications.

In order to design such a unit we assume that the on-line unit
consists of a linear cascade of identical Processing Elements
(PEs). Each PE is a complex logical module and contains logic to
perform the on-line operation under the control of the Global
Control Unit (GCU).

Figure (A.1) shows the schematic organization of the on-line
division unit along with the GCU.

The EU performs the exponent calculations.

The END UNIT allows the last PE to be identical to all the other
PEs as far as the interface is concerned.

The PEs collectively contain the fractional parts of all active
operands, one digit in each PE. The most significant digits are
in PE_1 and the least significant digits in PE_m. Output digits
are generated by the most significant Processing Element in an
on-line mode and are placed on the Z-Bus. Each output digit is
stored temporarily by all PEs and at the same time reaches the
next on-line unit.

After receiving the output digit and the transfer information
from its right-hand neighbor, each PE starts the computation and
generates one digit of the partial remainder. Depending on this
partial remainder and the truncated version of the divisor, the
next quotient digit is selected by PE_1 and is placed on the
output bus. This operation continues until the required precision
is obtained.


In order to determine the operation of each PE, we look at the
basic recursion formula for on-line division (Equation 3.12):

    P_j = r P_{j-1} - q_j D_j + n_{j+\delta} r^{-\delta} - Q_{j-1} d_{j+\delta} r^{-\delta}    (A.1)

Assuming that each of the operands is m digits long we have:

    P_j = \sum_{i=1}^{m} p_i^{(j)} r^{-i}    (A.2)

    D_j = \sum_{i=1}^{j+\delta} d_i r^{-i}    (A.3)

p_i^{(j)} = the i-th digit of the j-th partial remainder (this
digit is in PE_i).

n_i, d_i = the i-th digit of the operands (resident in PE_i).

The digits processed by PE_i in step j of the algorithm are
obtained from the following picture, in which * denotes the radix
point:

[Digit-alignment diagram of r P_{j-1}, D_j, n_{j+\delta} r^{-\delta}
and Q_{j-1} r^{-\delta}; not recoverable from the scan.]

Therefore, the digits processed by PE_i are obtained and (A.1)
becomes:

    p_i^{(j)} = p_{i+1}^{(j-1)} - q_j d_i + n_{j+\delta}[i=\delta] - q_{i+1-\delta} d_{j+\delta} + T_i^{(j)} - r T_{i-1}^{(j)}    (A.4)

where n_{j+\delta}[i=\delta] means that n_{j+\delta} is added in
PE_i only if i=\delta, and

    T_i^{(j)}     = transfer digit from PE_{i+1} at the j-th step
    T_{i-1}^{(j)} = transfer digit to PE_{i-1} at the j-th step

The quotient digit q_{i+1-\delta} is zero in every Processing
Element outside the following range:

    q_{i+1-\delta} = q_{i+1-\delta}    for  j-2+\delta \ge i \ge \delta
    q_{i+1-\delta} = 0                 otherwise                          (A.5)

Using Eq. (A.4), the following picture of the on-line divide unit
is obtained (Figure A.2).

In order to eliminate carry propagation between the PEs, we
assume that each digit of the partial remainder (p_i^{(j)}) in
PE_i is represented by an interim partial remainder (w_i^{(j)})
and a transfer digit (T_i^{(j)}) such that:

    p_i^{(j)} = w_i^{(j)} + T_i^{(j)}    (A.6)

The transfer function in Eq. (A.4) is obtained by a series of
three transformations f_1, f_2 and f_3 such that:

    f_1 :  -q_{i+1-\delta} d_{j+\delta} = r t_{i-1}^{P1(j)} + w_i^{P1(j)}
    f_2 :  -q_j d_i                     = r t_{i-1}^{P2(j)} + w_i^{P2(j)}    (A.7)

The transfer digits from PE_i to PE_{i-1} are t_{i-1}^{P1(j)} and
t_{i-1}^{P2(j)}, resulting from transformations f_1 and f_2
respectively. There is also a transfer digit out of the
Multi-input Adder (t_{i-1}^{A(j)}). Therefore:

    T_i^{(j)} = t_i^{P1(j)} + t_i^{P2(j)} + t_i^{A(j)}    (A.8)

Substituting (A.6), (A.7) and (A.8) in (A.4) we get:

    w_i^{(j)} = p_{i+1}^{(j-1)} + w_i^{P1(j)} + w_i^{P2(j)} + n_{j+\delta}[i=\delta] - r t_{i-1}^{A(j)}

    p_i^{(j)} = w_i^{(j)} + t_i^{P1(j)} + t_i^{P2(j)} + t_i^{A(j)}    (A.9)
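
The substitution leading to (A.9) can be checked numerically. The Python sketch below performs one digit step for a single PE, splitting each term into a transfer and an interim digit, and then confirms that the digit obtained this way agrees with the direct digit recursion (A.4). It assumes the readings p = w + T for (A.6) and "-product = r*t + w" for (A.7); it also uses Python's divmod as a stand-in for the redundant digit split, which is a simplification and not the signed-digit hardware of the thesis. All function and variable names are hypothetical.

```python
import random


def digit_step(r, p_next, q_j, d_i, n_term, q_old, d_new):
    """One PE step in the spirit of (A.6)-(A.9).

    Returns (w_i, T_out): the interim digit kept in this PE and the
    total transfer sent to the left neighbour.
    """
    t_p1, w_p1 = divmod(-q_old * d_new, r)   # f1: -q*d = r*t + w
    t_p2, w_p2 = divmod(-q_j * d_i, r)       # f2: -q*d = r*t + w
    # Adder stage: sum the local terms, pass the overflow on as t_a.
    t_a, w_i = divmod(p_next + w_p1 + w_p2 + n_term, r)
    return w_i, t_p1 + t_p2 + t_a


def direct_digit(r, p_next, q_j, d_i, n_term, q_old, d_new, T_in, T_out):
    """Right-hand side of the digit recursion (A.4)."""
    return p_next - q_j * d_i + n_term - q_old * d_new + T_in - r * T_out


random.seed(1)
r = 16
for _ in range(1000):
    args = [random.randint(-(r - 1), r - 1) for _ in range(6)]
    T_in = random.randint(-3, 3)          # transfer arriving from the right
    w_i, T_out = digit_step(r, *args)
    # Stored digit plus incoming transfer equals the direct recursion.
    assert w_i + T_in == direct_digit(r, *args, T_in, T_out)
```

The point of the split is visible in `digit_step`: the interim digit depends only on local quantities, so no carry chain runs across the cascade.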

Figure A.3 - Functional Representation of the Digit Algorithm
for On-Line Division

Transformation f_3 essentially requires a radix-r multi-input
adder which forms the sum of the digits of both signs. This adder
is implemented as a k-stage (r=2^k) linear cascade of radix-2
multi-input adders, where each input of a radix-2 adder can
assume three values {-1,0,1}. The organization of this adder is
shown in Figure (A.4).

The products q_j*d_i and q_{i+1-\delta}*d_{j+\delta} are
generated by two separate product matrix generators, each of
which consists of a k*k square array of redundant binary product
cells. Each cell performs the product of two redundant binary
digits q^* and d^*, and its output product digit is also in the
digit set {-1,0,1}. Figure (A.5) shows the operation of the digit
product generators f_1 and f_2 (k=4).

Therefore, transformation f_3 requires k MIRBAs (Multi-Input
Redundant Binary Adders), each capable of summing 2(k+1)
redundant binary inputs, as well as the 'Transfer' from the
adjacent MIRBA position [GOY 76]. Figure (A.6) schematically
shows the implementation of f_3 for radix 16.

Figure A.4 - Functional R...

Product Matrix Generator (Radix 16).


A.1 Design of a MIRBA

A MIRBA is a limited carry/borrow propagation adder which accepts
several redundant binary inputs (digit set {-1,0,1}) and produces
one redundant binary output (with [...] adder Transfers for the
more significant adjacent positions).

Using Rohatsch's technique [ROH 67], a 10-input MIRBA can be
realized with four simple transformations. Figure (A.7) shows one
such four-level adder (each level indicated by a box), which is
applicable for k<6.

Another way of implementing MIRBAs is the log-sum tree technique.
In this scheme each MIRBA is implemented by a log-sum tree
structure of two-input redundant binary adders (Borovec Units
[BOR 68]). For a 2(k+1)-input MIRBA, the tree structure has L
levels of Borovec Units (BUs) such that:

    L = \lceil \log_2 2(k+1) \rceil    (A.10)

and the number of BUs required is (2k+1). Figure (A.8) shows the
log-sum tree structure for a 10-input MIRBA.
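
The level and unit counts of the log-sum tree can be reproduced by reducing the inputs pairwise. The Python sketch below is an illustration only: it counts two-input adders in a balanced reduction and compares the result with L = ceil(log2 2(k+1)) and 2k+1. The function names are hypothetical, and the reduction order is an assumption (any pairing of n inputs needs n-1 two-input units).

```python
def log_sum_tree(n_inputs):
    """Reduce n_inputs operands with two-input adders.

    Returns (levels, units) for a pairwise reduction in which an odd
    operand is passed unchanged to the next level.
    """
    levels = 0
    units = 0
    width = n_inputs
    while width > 1:
        pairs = width // 2
        units += pairs
        width = pairs + (width % 2)
        levels += 1
    return levels, units


def ceil_log2(n):
    """Exact ceil(log2(n)) for a positive integer n."""
    return (n - 1).bit_length()


for k in range(1, 9):                      # radices 2 to 256
    n = 2 * (k + 1)                        # inputs of one MIRBA
    levels, units = log_sum_tree(n)
    assert levels == ceil_log2(n)          # L of (A.10)
    assert units == 2 * k + 1              # number of Borovec Units
    print(k, n, levels, units)
```

For k = 4 this gives the 10-input tree of Figure (A.8): four levels and nine units.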


Figure A.7 - Illustration of the Algebraic Design of a MIRBA
Using Simple Transformations (10 inputs)

Figure A.8 - Illustration of Log-Sum Tree Structure for a MIRBA
Using Borovec Units Only (k=4)


A.2 Logic Design of the Processing Element

The major components of the PE are the Register File for the
storage of active operands, the Digit Processing Logic (DPL),
which is essentially a large combinational logic circuit, and the
Local Control Unit (LCU), which supplies the control signals in
proper order to condition the combinational DPL. Figure (A.9)
shows the schematic block diagram of a Processing Element. The
register file comprises a set of digit-wide registers which are
used to hold the operand digits and the result digits.

The DPL operates on the operand digits stored in the register
file of the PE and the information received from its right
neighboring PEs. It also generates Transfer information for its
left neighbor PE. The LCU issues the timing control signals to
the processing logic for sequencing the various steps of the
digit algorithm.

The register file is a set of registers that are used to hold the
operand and result digits. Each PE retains one digit of each of
the active operands. Each register is (k+1) bits long to hold the
k magnitude bits and one sign bit of one sign-and-magnitude
encoded radix-2^k digit.

There must be at least seven registers in a PE: one for the
dividend, one for the divisor, one for the quotient digit and one
for the interim partial remainder (w_i^{(j)}). Three other
registers are used to hold the transfer functions (t_i^{(j)})
coming from [illegible]. In the next step of the computation
(j+1) these functions are gated to PE_{i-1} along with w_i^{(j)}.
They constitute the operands of PE_{i-1} in step (j+1).

Figure A.9 - Block Diagram of a Processing Element.

There are other registers in a PE which are used to hold the
intermediate results. These registers are located in the DPL and
will be shown later.

The registers in the register file are loaded from a buffer
register, IBR, whose contents are determined by the internal
Register Input Bus Selector, SRIB, in the Digit Processing Logic.
Similarly, the contents of the registers are input to the DPL
either directly or through an Output Bus Selector, SROB, also in
the DPL.

A.3 Block Diagram Description of DPL

Figure A.10 shows the data flow structure of the Digit Processing
Logic (DPL) in block diagram form. It consists of three major
components: the Digit Product Generator, DPG, a radix-2^k
multi-input adder, MIAD, and a Digit Sum Encoder, DSE. The DSE
converts the redundant binary sum output of adder MIAD to the
Sign and Magnitude format for local storage in the Register File,
or for transfer out of the PE.

As shown in Figure (A.10), there are input and output ports
designated as TIP_i, RIP_i and TOP_i, ROP_i, respectively. The
input port TIP_i carries the 'Transfer' (carry or borrow) from
the adjacent MIAD and the contents of some register in the
Register File of the adjacent PE_{i+1}. RIP_i carries the
quotient digit from PE_{i+1-\delta}. The output ports TOP_i and
ROP_i carry similar information for PE_{i-1} and PE_{i+1-\delta}
respectively.

A.4 Logic Design of a Radix-2^k Multi-Input Adder (MIAD)

In general, a radix-2^k multi-input adder consists of a linear
cascade of k MIRBAs. A 2(k+1)-input MIRBA is implemented as a
tree structure of BUs (see Fig. (A.8)). Each MIRBA requires 2k+1
BUs, arranged in L = \lceil \log_2 2(k+1) \rceil levels.
Therefore:

    G_MIAD = k(2k+1) G_BU     (A.11)
    t_MIAD = L \delta_BU      (A.12)

    G_MIAD    = number of gates required for one MIAD
    t_MIAD    = delay of one MIAD
    G_BU      = number of gates required for one BU
    \delta_BU = delay of one BU

For a 2(k+1)-input adder, the number of pins required for the
input and output adder transfers t_i^{A(j)} and t_{i-1}^{A(j)} is
2(2k+1) each (see Figure A.8).

A.5 Logic Design of DPG

The Digit Product Generator forms the product array of two signed
radix-2^k digits. It accepts the two digits encoded in Sign and
Magnitude format and outputs the product array in redundant
binary. The logical design of the DPG is shown in Figure (A.11).


The  number  of  gates  required  for  each  DPG  is  [GOY  76]: 
k2  AND  GATES 


k2  XOR  or  2k2  AND 


SM/RB  k^  AND 
NONE 


for  LVEj 
for  LVE, 


for  LVE. 


(A. 13) 


The pins contributed by the DPGs to the pin complexity of the DPL
are those required for the transfers t^{P1(j)}, t^{P2(j)},
t^{P1(j-1)} and t^{P2(j-1)}:

    No. of pins for a transfer signal = 1 + k(k-1)/2    (A.14)


A.6 Logic Design of Digit Sum Encoder

The Digit Sum Encoder (DSE) transforms the redundant binary sum
output of the radix-2^k adder into an algebraically equivalent
radix-2^k sum digit in Sign and Magnitude format for either local
storage in the Processing Element or transmission out of the PE.
The total number of gates, G_DSE, required by the DSE logic has
been found to be [GOY 76]:


GDSE 


16k  for  LVEj  and  LVE^ 
26k  for  LVE^ 


(A. 15) 


A.7 Logic Design of Selectors SRIB, SROB, STOP and STIP

The selector SRIB is a seven-input multiplexor. It constantly
examines the data on the D, Q and N busses. If the data on any of
these busses belongs to PE_i, it writes this data in the
corresponding registers in the register file. It also gates the
output of the DSE (w_i^{(j)}) to Register RW in the Register
File. The transfer function T_i^{(j)} (= t_i^{A(j)}, t_i^{P1(j)},
t_i^{P2(j)}), which should be sent to PE_{i-1} in the (j+1)-th
step, is gated through this selector to the Register File for
temporary storage. The width of the selector is obtained by the
following equations:

    b = MAX{ k+1, P_{tA(j)}, P_{tP1(j)}, P_{tP2(j)} }

    P_{tA(j)}  = pin count of t_i^{A(j)}  = 2(2k+1)
    P_{tP1(j)} = pin count of t_i^{P1(j)} = 1 + k(k-1)/2

    b = 2(2k+1)    (A.16)

The logic design of SRIB is similar to that shown in [GOY 76] and
the number of gates required is:

    G_SRIB = b + b + 2(1 + k(k-1)/2) + 4(k+1) = k^2 + 11k + 10    (A.17)

The selector SROB gates the contents of one of the registers of
the Register File onto the Register File Output Bus (ROB). The
gates required for this network depend on the number of registers
in the Register File and the bit width of the registers. There
are seven registers in the Register File. For radix 2^k, four of
them are (k+1) bits wide, one is 2(2k+1) bits wide and the other
two are (1 + k(k-1)/2) bits wide. Therefore, the gate
requirements of SROB are exactly the same as those of SRIB, that
is:

    G_SROB = k^2 + 11k + 10    (A.18)

The width of selector STOP is equal to the width of output port
TOP_i. The width of TOP_i is determined by the maximum length of
the "Adder Transfers". Therefore, the width of TOP_i is given by
Eq. (A.16). The logic implementation of STOP is shown in Figure
(A.12). From the given design we conclude that:

    G_STOP = 3b + 2(1 + k(k-1)/2) = k^2 + 11k + 8    (A.19)

The selector STIP is actually a four-output demultiplexor. The
width of STIP is exactly the same as that of STOP and is
therefore equal to b. The logic implementation of STIP is simple
and the number of gates required for this element is:

    G_STIP = b + k + 1 + 2(1 + k(k-1)/2) = k^2 + 4k + 5    (A.20)

Figure (A.13) shows the logic implementation of this selector.

A.8 Storage Buffer Registers of DPL


DPL has ten buffer registers, R_1 through R_9 and IBR. The width
of each of these registers is indicated in Table (A.1).

+------------+--------------+
| Buffer Reg | Width (bits) |
+------------+--------------+
| R_1        | 2(2k+1)      |
+------------+--------------+
| R_2        | k+1          |
+------------+--------------+
| R_3        | 1+k(k-1)/2   |
+------------+--------------+
| R_4        | 1+k(k-1)/2   |
+------------+--------------+
| R_5        | k+1          |
+------------+--------------+
| R_6        | k+1          |
+------------+--------------+
| R_7        | k+1          |
+------------+--------------+
| R_8        | k+1          |
+------------+--------------+
| R_9        | k+1          |
+------------+--------------+
| IBR        | 2(2k+1)      |
+------------+--------------+

Table (A.1) - Width of the Registers in DPL

A.9 Design of SM/RB, CHS1 and CHS2 Blocks

The SM/RB block converts an input represented in Sign and
Magnitude format to redundant binary representation. There are
nine distinct ways that we can encode a sign and magnitude
number. The simplest one is the encoding that assigns the sign of
the number to all the bits. Adopting this simple encoding, there
is no gate requirement for the SM/RB block. Therefore:

    G_SM/RB = 0    (A.21)

CHS1 and CHS2 are sign changers and, since their inputs are in
Sign and Magnitude format, each can be implemented by a single
inverter gate. Therefore:

    G_CHS1 = G_CHS2 = 1    (A.22)

A.10 Design of the Quotient Selection Unit

The selection of the quotient digits is done by the most
significant Processing Element (PE_1). The quotient digit
selector inside PE_1 is a table look-up device which implements
the SELECT function (see Algorithm MAIN DIVIDE). It examines the
\gamma most significant digits of r P_{j-1} and the \beta most
significant digits of D_j in order to select the appropriate
quotient digit, q_j.

According to Eq. (A.9):

    P_{j-1} = \sum_{i=1}^{m} p_i^{(j-1)} r^{-i}
            = \sum_{i=1}^{m} [w_i^{(j-1)} + T_i^{(j-1)}] r^{-i}    (A.23)

Therefore, the truncated version of P_{j-1} (i.e., \hat{P}_{j-1})
is:

    \hat{P}_{j-1} = \sum_{i=1}^{\gamma} [w_i^{(j-1)} + T_i^{(j-1)}] r^{-i}    (A.24)

This means that the T_i's and w_i's can be used as the address
lines of a ROM device implementing the SELECT function. It is not
difficult to see that even for small radices the number of input
lines to the device will be prohibitive [IRW 77]. Two techniques
to avoid this difficulty are available: 1) use a PLA, or 2)
perform carry propagation on the most significant portion of
P_{j-1} to reduce the number of lines required. Irwin shows that
the number of input lines is reduced by up to 44% if this
technique is used [IRW 77]. In estimating the cost of the
Processing Elements we have ignored the cost of the Selection
Block because it effectively appears in only one PE (PE_1). The
time required by the selection process has been estimated to be
of the order of 4-5 gate delays [ERC 75, IRW 77]. In the delay
analysis of the division unit we assume:

    t_select = t_s = 4\delta_g


A.11 Gate Complexity of Digit Processing Logic

The total number of gates required for the implementation of the
DPL is the sum of the gates required for each of its components.
From Equation (A.11) we have:

    G_MIAD = k(2k+1) G_BU

Each Borovec Unit (BU) requires 26 gates [GOY 76]. Therefore, the
total number of gates required for the Multi-input Adder is:

    G_MIAD = 52k^2 + 26k

The other components of the DPL require the following numbers of
gates:

    G_DPG(1) = k^2 AND + 2 XOR = k^2 + 8 gates
    G_DPG(2) = k^2 + 8 gates
    G_DSE    = 16k
    G_SRIB   = k^2 + 11k + 10
    G_SROB   = k^2 + 11k + 10
    G_STOP   = k^2 + 11k + 8
    G_STIP   = k^2 + 4k + 5
    G_SM/RB  = 0

Adding these to G_MIAD, together with the two single-gate sign
changers CHS1 and CHS2 of Eq. (A.22), we get:

    G_DPL = 58k^2 + 79k + 51    (A.25)

Table (A.2) shows the gate complexity of the DPL.
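
The total in (A.25) can be cross-checked by summing the component counts. The Python sketch below is only a bookkeeping aid: it rebuilds each selector count from the width b of (A.16) and the transfer pin count of (A.14), then compares the sum with 58k^2 + 79k + 51. The function names are hypothetical, the value of G_DPG(2) is taken to equal that of G_DPG(1), and the two sign changers are included in the sum; both of the latter are assumptions made so that the components add up to the stated total.

```python
G_BU = 26  # gates per Borovec Unit, as quoted from [GOY 76]


def transfer_pins(k):
    """Pins for one product transfer signal, 1 + k(k-1)/2 (A.14)."""
    return 1 + k * (k - 1) // 2


def selector_width(k):
    """Selector width b of (A.16), evaluated as the MAX it is defined by."""
    return max(k + 1, 2 * (2 * k + 1), transfer_pins(k))


def dpl_components(k):
    """Gate count of every DPL component for radix 2**k."""
    b = selector_width(k)
    tp = transfer_pins(k)
    return {
        "MIAD": k * (2 * k + 1) * G_BU,       # (A.11)
        "DPG(1)": k * k + 8,
        "DPG(2)": k * k + 8,                  # assumed equal to DPG(1)
        "DSE": 16 * k,
        "SRIB": b + b + 2 * tp + 4 * (k + 1),  # (A.17)
        "SROB": b + b + 2 * tp + 4 * (k + 1),  # (A.18)
        "STOP": 3 * b + 2 * tp,                # (A.19)
        "STIP": b + k + 1 + 2 * tp,            # (A.20)
        "SM/RB": 0,                            # (A.21)
        "CHS1": 1,                             # (A.22)
        "CHS2": 1,
    }


for k in range(1, 9):  # radices 2 to 256
    parts = dpl_components(k)
    assert parts["SRIB"] == k * k + 11 * k + 10
    assert parts["STOP"] == k * k + 11 * k + 8
    assert parts["STIP"] == k * k + 4 * k + 5
    assert sum(parts.values()) == 58 * k * k + 79 * k + 51  # (A.25)
```

Note that b = 2(2k+1) is the maximum only while 1 + k(k-1)/2 does not exceed it, which holds throughout the range k = 1 to 8 considered here.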


A.12 Pin Complexity of DPL

The total number of pins, P_DPL, necessary for the logic
implementation of the DPL is the sum of the pins required for the
input ports TIP_i and RIP_i and the output ports TOP_i and ROP_i:

    P_DPL = P_TOP_i + P_ROP_i + P_TIP_i + P_RIP_i

From the adder-transfer pin count of Section A.4 we have:

    P_TIP_i = P_TOP_i = P_{tA(j-1)} = 2(2k+1)    (A.26)

and since the information on RIP_i is a single digit:

    P_ROP_i = P_RIP_i = k+1    (A.27)

Substituting (A.26) and (A.27) in the equation for P_DPL we get:

    P_DPL = 10k + 6    (A.28)

Encoding of a Redundant Binary Digit.


A.13 Overall Logic Complexity of a PE

The total number of gates, G_PE, required for the implementation
of a PE is the sum of the gates required for the combinational
logic of the DPL, the gates required for the PE control logic and
the gates required for the implementation of the storage
registers in the PE. The storage registers in a PE comprise the
registers in the Register File and the buffer registers in the
DPL. Using Table (A.1), the number of gates needed for storage
is:

    G_STO = [6(k+1) + 4(2k+1) + k(k-1) + 2] G_D
          = (k^2 + 13k + 12) G_D

G_D is the number of gates required for the realization of a
D-type flip-flop. Assuming G_D = 6 [TEX 69] we get:

    G_STO = 6k^2 + 78k + 72    (A.29)

Ignoring the number of gates needed for PE control, the number of
gates required for each PE is:

    G_PE = G_DPL + G_STO

Substituting the values from Equations (A.25) and (A.29) we get:

    G_PE = 64k^2 + 157k + 123    (A.30)

The pin requirement for each PE is the sum of the pins required
for the DPL plus the number of pins needed for the input and
output busses (ignoring the pins required for control signals
from the GCU). That is:

    P_PE = P_DPL + P_N-BUS + P_D-BUS + P_Q-BUS

or

    P_PE = 13k + 9    (A.31)

The pin and gate requirements of the DPL and PE, along with the
gate requirements of the other PE components, are shown in Table
(A.3).
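
To make the closed forms concrete, the Python sketch below evaluates the gate and pin counts of one PE for radices 2 to 256 (k = 1 to 8). It is a numerical restatement of (A.25) and (A.28) to (A.31), not an independent derivation. The function names are hypothetical, and each of the three busses is assumed to carry one (k+1)-bit digit, which is what the step from 10k + 6 to 13k + 9 implies.

```python
G_D = 6  # gates per D-type flip-flop, as assumed in the text [TEX 69]


def storage_gates(k):
    """G_STO: flip-flop gates for the register bits of Table (A.1)."""
    bits = 6 * (k + 1) + 4 * (2 * k + 1) + k * (k - 1) + 2
    return bits * G_D


def dpl_gates(k):
    """G_DPL from (A.25)."""
    return 58 * k * k + 79 * k + 51


def pe_gates(k):
    """G_PE = G_DPL + G_STO, PE control logic ignored."""
    return dpl_gates(k) + storage_gates(k)


def dpl_pins(k):
    """P_DPL: TIP and TOP at 2(2k+1) pins each, RIP and ROP at k+1 each."""
    return 2 * 2 * (2 * k + 1) + 2 * (k + 1)


def pe_pins(k):
    """P_PE: DPL pins plus three digit-wide busses (assumed k+1 bits each)."""
    return dpl_pins(k) + 3 * (k + 1)


print(" k  radix  gates  pins")
for k in range(1, 9):
    assert storage_gates(k) == 6 * k * k + 78 * k + 72   # (A.29)
    assert pe_gates(k) == 64 * k * k + 157 * k + 123      # (A.30)
    assert dpl_pins(k) == 10 * k + 6                      # (A.28)
    assert pe_pins(k) == 13 * k + 9                       # (A.31)
    print(f"{k:2d}  {2**k:5d}  {pe_gates(k):5d}  {pe_pins(k):4d}")
```

The end points, 344 gates at radix 2 and 5475 gates at radix 256, agree with the range of roughly 350 to 5500 gates quoted in the conclusions.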

APPENDIX B


TIME (DELAY) CONSIDERATIONS OF AN ON-LINE DIVISION UNIT

The time required to compute a single quotient digit (k+1 bits)
is composed of the following elements (see Algorithm MAIN
DIVIDE):

1. Time to select a quotient digit (t_s)

2. Time to update the Q_j and [illegible] registers (t_u)

3. Time to perform the basic recursion formula (t_R)

The following diagram indicates the relative position of these
three delays with respect to one another.


[Timing diagram: SELECTION, UPDATE and RECURSION phases within
one STEP, with the arrival of the next operand digits n and d
marked; not recoverable from the scan.]

Since usually t_s and t_R are greater than t_u, the total time
for one step of the algorithm (T_STEP) is:

    T_STEP = t_s + t_R    (B.1)

Each step starts when the digits of the dividend (n_{j+\delta})
and divisor (d_{j+\delta}) appear on the input busses (N-BUS and
D-BUS). At the beginning of each step, selection of the quotient
digit (q_j) is initiated by the quotient selection unit in the
most significant Processing Element (PE_1). This selection is
based on the truncated versions of the previous partial remainder
and the divisor (D_j).

PE_1 outputs q_j on the Q-BUS. After reception of this quotient
digit and some other information from its right neighbor, each PE
starts processing one digit of the next partial remainder (P_j).
After a certain amount of time (t_PE), the next partial remainder
is available in a redundant format (w_i^{(j)} and t_i^{(j)}).
This process continues until the required precision is obtained.
We compute t_PE by measuring the time span between the setting of
all registers (R_1 through R_9) in PE_i at steps (j) and (j+1).
Therefore (B.1) can be rewritten as:

    T_STEP = t_s + t_PE    (B.2)

The graph representation of t_s + t_PE is shown in Figure (B.1)
[refer to the block diagram of the MAIN Unit in Appendix A].
Using this graph, T_STEP is found to be:

    T_STEP = 2t_SRIB + t_DSE + t_MIAD + t_SM/RB + t_STIP + t_STOP
             + 3t_SROB + t_s    (B.3)

Figure B.1 - Graph Representation of T_STEP
The components of T_STEP are as follows:

The time required by the selection block has been estimated to be
of the order of 4-5 gate delays [IRW 77]. Assume:

    t_s = 4\delta_g    (B.4)

t_SRIB:

The logic design of the Register File Input Bus Selector (SRIB)
is given in Appendix A. According to this design:

    t_SRIB = 2\delta_g    (B.5)

t_SROB:

Referring to Appendix A:

    t_SROB = 2\delta_g    (B.6)

Also we have:

    t_STOP = 2\delta_g    (B.7)

    t_CHS1 = \delta_g    (B.8)

t_DPG:

Assuming the LVE (Logic Vector Encoding) adopted for the operands
[GOY 76], and according to what has been explained in the design
of the Digit Product Generator, we get:

    t_DPG = t_XOR    (B.9)

t_MIAD:

According to Eq. (A.12) in Appendix A:

    t_MIAD = L \delta_BU    (B.10)

such that L = \lceil \log_2 2(k+1) \rceil and \delta_BU is the
time required by one Borovec Unit [BOR 68]. Using the same LVE,
\delta_BU is obtained to be [GOY 76]:

    \delta_BU = 7\delta_g    (B.11)

Therefore:

    t_MIAD = 7\delta_g \lceil \log_2 2(k+1) \rceil    (B.12)


t_DSE:

From the design given in [GOY 76], t_DSE can be estimated
approximately to be:

    t_DSE \approx 5k\delta_g + 3k\delta_g = 8k\delta_g    (B.13)

t_STIP:

According to the design given in Appendix A:

    t_STIP = \delta_g    (B.14)

Adding the components in Eq. (B.3) we get:

    T_STEP = [8k + 7\lceil \log_2 (k+1) \rceil + 24] \delta_g    (B.15)

Table (B.1) shows T_STEP and its components. From this table it
can be deduced that the contribution of the Digit Sum Encoder
(DSE) to the total step time dominates all other components for
relatively large radices. This unit can, however, be eliminated
if w_i^{(j)} can be stored in redundant format. That is, RW and
R_j should be made double bank registers. The STIP, SRIB, SROB
and STOP blocks would also have to be modified.
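
The step time can be tabulated directly from the component delays. The Python sketch below sums the terms of (B.3) in units of one gate delay and compares the result with the closed form (B.15). It assumes a Borovec Unit delay of 7 gate delays and a zero delay for the SM/RB block (which needs no gates under the simple encoding of Appendix A); both are readings adopted for this illustration, and the function names are hypothetical.

```python
def ceil_log2(n):
    """Exact ceil(log2(n)) for a positive integer n."""
    return (n - 1).bit_length()


def step_delay(k):
    """T_STEP of (B.3) in gate delays for radix 2**k."""
    t_s = 4                      # quotient selection (B.4)
    t_srib = 2                   # (B.5)
    t_srob = 2                   # (B.6)
    t_stop = 2                   # (B.7)
    t_stip = 1                   # (B.14)
    t_smrb = 0                   # assumed: SM/RB is wiring only
    t_dse = 8 * k                # (B.13)
    t_miad = 7 * ceil_log2(2 * (k + 1))   # L levels of 7-delay BUs (assumed)
    return (2 * t_srib + t_dse + t_miad + t_smrb
            + t_stip + t_stop + 3 * t_srob + t_s)


def step_delay_closed_form(k):
    """Closed form (B.15): 8k + 7*ceil(log2(k+1)) + 24."""
    return 8 * k + 7 * ceil_log2(k + 1) + 24


print(" k  radix  T_STEP  DSE share")
for k in range(1, 9):
    total = step_delay(k)
    assert total == step_delay_closed_form(k)
    print(f"{k:2d}  {2**k:5d}  {total:6d}  {8 * k / total:8.0%}")
```

The end points are 39 gate delays at radix 2 and 116 at radix 256, matching the range quoted in the conclusions, and the last column shows the DSE term growing to more than half of the step time at the largest radix.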


APPENDIX C


ON-LINE  MULTIPLICATION 

The  problem  of  on-line  multiplication  has  been  ad¬ 
dressed  by  Trivedi  and  Ercegovac  [TRI  77]  and  by  Irwin  [IRW 
77].  These  two  references  deal  with  on-line  multiplication 
when  the  operands  are  represented  in  a  non-redundant  number 
representation  system . 

The  purpose  of  this  appendix  is  to  present  a  systematic 
method  for  derivation  of  on-line  multiplication  which  is 
compatible  with  the  method  given  for  division  [GOR  80].  The 
problem  of  redundant  operand  multiplication  is  addressed  and 
it  will  be  proved  that  the  given  upper  bounds  for  the 
operands  in  the  aforementioned  references  are  pessimistic 
and  the  correct  value  will  be  derived. 


Redundant Operands

Let the radix-r representations of the multiplicand, multiplier
and product be denoted by X, Y and R respectively, such that:

    X = \sum_{i=1}^{m} x_i r^{-i}    (C.1)

    Y = \sum_{i=1}^{m} y_i r^{-i}    (C.2)

    R = \sum_{i=1}^{m} p_i r^{-i}    (C.3)

and R = X*Y to m digits of precision.

We assume x_i and y_i belong to the following redundant digit
set:

    x_i, y_i \in {-\rho', ..., -1, 0, 1, ..., \rho'}    (r-1) \ge \rho' \ge r/2    (C.4)

p_i may belong to a different redundant digit set:

    p_i \in {-\rho, ..., -1, 0, 1, ..., \rho}    (r-1) \ge \rho \ge r/2    (C.5)

The redundancy coefficients of X, Y and R are defined as:

    k  = \rho/(r-1)     (r-1) \ge \rho \ge r/2
    k' = \rho'/(r-1)    (r-1) \ge \rho' \ge r/2

Assume that X and Y are bounded by a positive constant M such
that:

    -M < X, Y < M    (C.6)

M specifies the maximum and the minimum values that the operands
can assume and is a function of r and \rho. This value will be
derived later in this appendix.


The algorithm which produces the product of two redundant
operands X and Y is called Algorithm "MULT" and is shown below
[TRI 77]:

Algorithm MULT

Step 1 [Initialization]:

    P_0 = 0, X_0 = 0, Y_0 = 0
    R_0 = 0, p_0 = 0

For j = 1, 2, ..., m Do:

Step 2 [Input Digit]:

    X_j = X_{j-1} + x_j r^{-j}
    Y_j = Y_{j-1} + y_j r^{-j}

Step 3 [Basic Recursion]:

    P_j = r(P_{j-1} - p_{j-1}) + X_j y_j + Y_{j-1} x_j    (C.7)

Step 4 [Selection]:

    p_j = SELECT(P_j)
    R_j = R_{j-1} + p_j r^{-j}

Step 5 [End Do]
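
The steps of Algorithm MULT can be sketched in Python with exact rational arithmetic. The sketch below is an illustration under stated assumptions and not the selection logic of the thesis: the recursion is taken as P_j = r(P_{j-1} - p_{j-1}) + X_j y_j + Y_{j-1} x_j, which is the form implied by the first terms of the convergence proof; SELECT is replaced by rounding P_j to the nearest integer and clamping it to the digit set; and the radix-2 example keeps the first two operand digits at zero so that the operand prefixes stay below 1/4. The names `online_mult` and `value` are hypothetical.

```python
from fractions import Fraction
import math


def online_mult(x_digits, y_digits, r=2, rho=1):
    """Digit-serial multiplication, most significant digit first.

    Returns (product_digits, R, residual) with residual = P_m - p_m.
    """
    X = Y = R = P = Fraction(0)
    p_prev = 0
    digits = []
    for j, (x_j, y_j) in enumerate(zip(x_digits, y_digits), start=1):
        Y_prev = Y
        X += Fraction(x_j, r**j)                    # Step 2
        Y += Fraction(y_j, r**j)
        P = r * (P - p_prev) + X * y_j + Y_prev * x_j   # Step 3 (assumed form)
        # Step 4: rounding stands in for SELECT (assumption).
        p_j = max(-rho, min(rho, math.floor(P + Fraction(1, 2))))
        R += Fraction(p_j, r**j)
        digits.append(p_j)
        p_prev = p_j
    return digits, R, P - p_prev


def value(digits, r=2):
    """Value of a signed-digit fraction 0.d1 d2 d3 ..."""
    return sum(Fraction(d, r**i) for i, d in enumerate(digits, start=1))


x = [0, 0, 1, -1, 1, 0, 1, 1]     # redundant binary digits in {-1, 0, 1}
y = [0, 0, -1, 1, 1, 1, 0, -1]
digits, R, residual = online_mult(x, y)
m = len(x)

# The product digits reproduce X*Y up to the scaled final residual.
assert R == value(x) * value(y) - residual / 2**m
assert abs(residual) <= Fraction(1, 2)
print(digits, R, value(x) * value(y))
```

Each product digit is produced as soon as the corresponding operand digits have arrived, and the final residual bounds the error of the m-digit result.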

Proof of Convergence

Inserting different values of j into the basic recursion formula
(Eq. C.7) we get:

    j=1  ->  P_1 = X_1 y_1

    j=2  ->  P_2 = r X_1 y_1 + X_2 y_2 + Y_1 x_2 - r p_1
                 = r^2 X_2 Y_2 - r p_1

    j=3  ->  P_3 = r^3 X_3 Y_3 - r(r p_1 + p_2)

Continuing this procedure, P_j is obtained as follows:

    P_j = r^j X_j Y_j - r(r^{j-2} p_1 + r^{j-3} p_2 + ... + p_{j-1})
        = r^j X_j Y_j - r^j R_{j-1}    (C.8)

When j=m:

    P_m = r^m X_m Y_m - r^m R_{m-1}    (C.9)

From (C.3) we have:

    R = R_{m-1} + p_m r^{-m}

Inserting this in (C.9), with X_m = X and Y_m = Y, we get:

    P_m = r^m XY - r^m (R - p_m r^{-m})

or

    R = XY - r^{-m} (P_m - p_m)    (C.10)

By devising a product digit selection procedure, SELECT, in Step
4 of Algorithm "MULT", such that:

    |P_m - p_m| \le k    (C.11)

R = X*Y can be computed to m digits of precision. Note that the
algorithm as it stands produces just the most significant half of
the product. The least significant half of the product is
available as the redundant output of the adder after iteration
m+1, i.e.,

    P_{m+1} = P_m - p_m    (C.12)

By feeding these redundant adder digits directly into the
recoding unit, the least significant half of the product can also
be output in conventional form.


Range Restriction Analysis

Assume that the required SELECTION process in Step 4 of Algorithm
"MULT" has been found and the graph of Figure (C.1) is obtained.
This is a plot of the partial product at step j versus the
partial product at step (j-1). Such a plot is designated a P-P
plot [IRW 77]. By analyzing it, a product digit selection
procedure can be specified for the given r and \rho. The notation
used in this graph is similar to that used for division [GOR 80]:

U_i - Upper bound for the region in which the product digit i is
chosen
L_i - Lower bound for the region in which the product digit i is
chosen

(P_j)^i - The j-th partial product with p_{j-1} = i chosen as the
product digit.

The only restriction we assume is the fact that, in the basic
recursion (C.7), the partial products should be bounded by some
constants.

In step (j-1) assume $p_{j-1}=i$ is chosen. From the basic recursion
formula (C.7) we are able to find the maximum and minimum values that
$P_j$ can assume:

$$(P_j)_{\max}^{p_{j-1}=i} = rU_i - ri + 2M\rho' \tag{C.13}$$

$$(P_j)_{\min}^{p_{j-1}=i} = rL_i - ri - 2M\rho' \tag{C.14}$$

When $p_{j-1}=\rho$, from (C.13) we get:

$$(P_j)_{\max}^{p_{j-1}=\rho} = rU_\rho - r\rho + 2M\rho'$$

In order for $P_j$ to be bounded, this value should be equal to
$U_\rho$, therefore:

$$rU_\rho - r\rho + 2M\rho' = U_\rho$$

or

$$U_\rho = rk - 2Mk' \tag{C.15}$$

Similarly for the lower bound:

$$(P_j)_{\min}^{p_{j-1}=-\rho} = L_{-\rho}$$

This results in:

$$L_{-\rho} = -rk + 2Mk' \tag{C.16}$$

From (C.15) and (C.16) we get:

$$rk - 2Mk' \ge P_j \ge -rk + 2Mk' \tag{C.17}$$
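
The bounds (C.15) and (C.16) can be checked numerically. The sketch below solves the two fixed-point conditions with exact rational arithmetic and compares the result with the closed forms. The function name `range_bounds`, the sample values (r = 2, ρ = ρ' = 1, M = 1/4), and the use of k = ρ/(r-1), k' = ρ'/(r-1) as the redundancy coefficients are assumptions chosen for illustration.

```python
from fractions import Fraction

def range_bounds(r, rho, rho_in, M):
    """Solve r*U - r*rho + 2*M*rho_in = U and r*L + r*rho - 2*M*rho_in = L."""
    U = (r * rho - 2 * M * rho_in) / Fraction(r - 1)
    L = (-r * rho + 2 * M * rho_in) / Fraction(r - 1)
    return U, L

r, rho, rho_in, M = 2, 1, 1, Fraction(1, 4)   # illustrative values only
k, k_in = Fraction(rho, r - 1), Fraction(rho_in, r - 1)
U, L = range_bounds(r, rho, rho_in, M)

# The solved bounds agree with the closed forms r*k - 2*M*k' and -r*k + 2*M*k'.
assert U == r * k - 2 * M * k_in
assert L == -r * k + 2 * M * k_in
# They are fixed points of the worst-case recursion steps.
assert r * U - r * rho + 2 * M * rho_in == U
assert r * L + r * rho - 2 * M * rho_in == L
```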
Selection Region


Selection regions can be obtained with the help of Equations (C.13),
(C.14), (C.15) and (C.16) as follows:

$$(P_j)_{\max}^{p_{j-1}=i} \le U_\rho \quad \text{for all } i\text{'s}$$

Inserting the values from (C.13) and (C.15) we get:

$$U_i \le i + k - 2Mk' \tag{C.18}$$

Also for the lower bound the following inequality is always
satisfied:

$$(P_j)_{\min}^{p_{j-1}=i} \ge L_{-\rho} \quad \text{for all } i\text{'s}$$

Using (C.14) and (C.16) we get:

$$L_i \ge i - k + 2Mk' \tag{C.19}$$

In order to have maximum overlap between the adjacent regions i and
(i+1), $U_i$ should be as large as possible and $L_i$ as small as
possible. Therefore, from (C.18) and (C.19) we have:


$$U_i = i + k - 2Mk', \qquad L_i = i - k + 2Mk' \tag{C.20}$$

and therefore:

$$i + k - 2Mk' \ge (P_j)^{p_j=i} \ge i - k + 2Mk' \tag{C.21}$$

In order to have overlap between the adjacent regions the following
inequality should be satisfied:

$$U_i \ge L_{i+1}, \quad \text{i.e.} \quad M \le \frac{2k-1}{4k'} \tag{C.22}$$

This means that the maximum allowable value of the multiplicand and
multiplier is equal to $(2k-1)/4k'$. If the operands are larger than
this bound, then there will be a "gap" between adjacent regions. That
is, there will be some values of $P_j$ for which there is no
acceptable product digit $p_j$.

For example, when r=2 and k=k'=1:

$$M \le \frac{1}{4}$$

So shifting the operands two bits to the right will guarantee the
convergence of the algorithm.
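
The overlap condition can be made concrete with a small computation. Requiring $U_i \ge L_{i+1}$ with $U_i = i + k - 2Mk'$ and $L_i = i - k + 2Mk'$ gives $M \le (2k-1)/(4k')$. The sketch below builds the selection intervals and tests for gaps at and beyond that bound. The helper names `selection_intervals` and `has_gap` are hypothetical, and the parameters are those of the radix-2 example (k = k' = 1).

```python
from fractions import Fraction

def selection_intervals(rho, k, k_in, M):
    """Return {i: (L_i, U_i)} with U_i = i + k - 2*M*k' and L_i = i - k + 2*M*k'."""
    return {i: (i - k + 2 * M * k_in, i + k - 2 * M * k_in)
            for i in range(-rho, rho + 1)}

def has_gap(intervals):
    """True if some adjacent pair of regions fails U_i >= L_{i+1}."""
    keys = sorted(intervals)
    return any(intervals[i][1] < intervals[i + 1][0] for i in keys[:-1])

k = k_in = Fraction(1)              # r = 2, rho = rho' = 1
M_max = (2 * k - 1) / (4 * k_in)    # largest operand magnitude without gaps
assert M_max == Fraction(1, 4)
assert not has_gap(selection_intervals(1, k, k_in, M_max))
assert has_gap(selection_intervals(1, k, k_in, Fraction(1, 2)))
```

At M = 1/4 adjacent regions just touch; at M = 1/2 the region for digit i ends at i while the next one starts at i+1, leaving values with no admissible digit.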

Letting j=m in Eq. (C.21) we get:

$$p_m + k - 2Mk' \ge P_m \ge p_m - k + 2Mk'$$

or

$$k - 2Mk' \ge P_m - p_m \ge -k + 2Mk'$$

and since M>0 and k'>0, Eq. (C.11) is satisfied and R is indeed the
product of X and Y to m digits of precision.

REMARKS


From Eq. (C.22) it is clear that on-line multiplication is not
possible when k=1/2 or when the product is not redundant.

APPENDIX D


ON-LINE  ADDITION/SUBTRACTION 

In this appendix a systematic derivation of on-line
addition/subtraction algorithms will be presented. This method is
compatible with the methods given for on-line multiplication and
division in Appendix C and [GOR 80] respectively. The derivation is
applicable to both redundant and non-redundant operands. However, in
what follows we consider only addition and subtraction with redundant
operands.

Addition 

Let the radix r representation of addend, augend and sum be denoted
by A, B and R respectively such that:

$$A = \sum_{i=1}^{m} a_i r^{-i} \tag{D.1}$$

$$B = \sum_{i=1}^{m} b_i r^{-i} \tag{D.2}$$

$$R = \sum_{i=0}^{m} s_i r^{-i} \tag{D.3}$$

and R=A+B to m digits of precision.

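We assume $a_i$, $b_i$, and $s_i$ belong to three different redundant
digit sets:

$$a_i \in \{-\rho', \ldots, -1, 0, 1, \ldots, \rho'\}, \qquad (r-1) \ge \rho' \ge r/2 \tag{D.4}$$

$$b_i \in \{-\rho'', \ldots, -1, 0, 1, \ldots, \rho''\}, \qquad (r-1) \ge \rho'' \ge r/2 \tag{D.5}$$

$$s_i \in \{-\rho, \ldots, -1, 0, 1, \ldots, \rho\}, \qquad (r-1) \ge \rho \ge r/2 \tag{D.6}$$

Redundancy coefficients of A, B and R are defined as:

$$k' = \frac{\rho'}{r-1}, \qquad k'' = \frac{\rho''}{r-1}, \qquad k = \frac{\rho}{r-1} \tag{D.7}$$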


The following algorithm generates the sum of A and B in on-line mode.
We call this algorithm "ADD". This algorithm is a modification of the
addition algorithm shown in [IRW 77].

Algorithm ADD


Step 1 [Initialization]:

$$P_0 = 0, \qquad s_{-1} = 0, \qquad R_{-1} = 0$$

For j = 1, 2, ..., m+1 Do:

Step 2 [Input Digit]:

$a_j$ and $b_j$

Step 3 [Basic Recursion]:

$$P_j = r(P_{j-1} - s_{j-2}) + (a_j + b_j) r^{-1} \tag{D.8}$$

Step 4 [Selection]:

$$s_{j-1} = \mathrm{SELECT}(P_j, c_j)$$

Step 5 [End Do]
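
To make the control flow of "ADD" concrete, the following Python sketch runs the recursion (D.8) with exact rational arithmetic. The selection function used here, `select_round` (nearest integer, ties toward zero), is a stand-in chosen for simplicity and is not the selection procedure derived in this appendix; it ignores $c_j$. The radix r = 4 and the digit range {-3, ..., 3} are likewise assumptions, picked because this simple rule then keeps every sum digit inside the digit set.

```python
from fractions import Fraction
import math

def select_round(P):
    """Hypothetical SELECT: nearest integer to P, ties broken toward zero."""
    magnitude = math.ceil(abs(P) - Fraction(1, 2))
    return magnitude if P >= 0 else -magnitude

def online_add(a, b, r):
    """Run the ADD recursion on digit lists a = [a_1..a_m], b = [b_1..b_m].

    Returns the sum digits [s_0, ..., s_m], most significant first.
    """
    m = len(a)
    P = Fraction(0)   # P_0 = 0
    s_prev = 0        # holds s_{j-2}; starts as s_{-1} = 0
    digits = []
    for j in range(1, m + 2):
        c = a[j - 1] + b[j - 1] if j <= m else 0   # a_{m+1} = b_{m+1} = 0
        P = r * (P - s_prev) + Fraction(c, r)      # basic recursion (D.8)
        s_prev = select_round(P)                   # selection gives s_{j-1}
        digits.append(s_prev)
    return digits

def value(digits, r, first_exp):
    """Value of sum(d_i * r**-(first_exp + i)) over the digit list."""
    return sum(Fraction(d) / Fraction(r) ** (first_exp + i)
               for i, d in enumerate(digits))

a = [3, -2, 1, 3]
b = [3, 3, -1, 3]
s = online_add(a, b, r=4)
assert all(abs(d) <= 3 for d in s)                        # digits stay in range
assert value(s, 4, 0) == value(a, 4, 1) + value(b, 4, 1)  # R = A + B
```

Each sum digit is produced one step after the corresponding operand digits arrive, which is the on-line property the algorithm is meant to provide.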
Proof of Convergence


Inserting different values of j into the basic recursion formula
(Eq. D.8) we get:

$$j=1: \quad P_1 = (a_1+b_1) r^{-1}$$

$$j=2: \quad P_2 = (a_1+b_1) + (a_2+b_2) r^{-1} - r s_0$$

$$j=3: \quad P_3 = r(a_1+b_1) + (a_2+b_2) + (a_3+b_3) r^{-1} - r^2 s_0 - r s_1$$

Therefore, $P_j$ is:

$$P_j = r^{j-1}\left[(a_1+b_1)r^{-1} + (a_2+b_2)r^{-2} + \cdots + (a_j+b_j)r^{-j}\right] - r^{j-1}\left[s_0 + r^{-1}s_1 + \cdots + r^{-(j-2)}s_{j-2}\right]$$

or

$$P_j = r^{j-1}\sum_{i=1}^{j}(a_i+b_i)r^{-i} - r^{j-1}\sum_{i=0}^{j-2}s_i r^{-i} \tag{D.9}$$

If j=m+1, then (since $a_{m+1}=b_{m+1}=0$):

$$P_{m+1} = r^{m}\sum_{i=1}^{m}(a_i+b_i)r^{-i} - r^{m}\sum_{i=0}^{m-1}s_i r^{-i} \tag{D.10}$$

Using (D.3) we get:

$$R = \sum_{i=0}^{m-1}s_i r^{-i} + s_m r^{-m}$$

or

$$\sum_{i=0}^{m-1}s_i r^{-i} = R - s_m r^{-m}$$

Inserting this into (D.10), together with (D.1) and (D.2), we obtain:

$$P_{m+1} = r^{m}(A+B) - r^{m}(R - s_m r^{-m})$$

Rearranging the terms we get:

$$R = (A+B) - r^{-m}(P_{m+1} - s_m) \tag{D.11}$$
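
Relation (D.11) is a purely algebraic consequence of the recursion (D.8) and holds for any choice of sum digits; the selection procedure matters only for keeping the residual $P_{m+1}-s_m$ small. The sketch below checks this with arbitrary (randomly drawn, not selected) digits. The helper names `final_residual` and `frac_value`, the radix 4, and the digit range are assumptions made for the illustration.

```python
from fractions import Fraction
import random

def final_residual(a, b, s, r):
    """Return P_{m+1} from recursion (D.8), given sum digits s = [s_0..s_m]."""
    m = len(a)
    s_hist = [0] + list(s)          # s_hist[i + 1] = s_i, s_hist[0] = s_{-1} = 0
    P = Fraction(0)
    for j in range(1, m + 2):
        c = a[j - 1] + b[j - 1] if j <= m else 0
        P = r * (P - s_hist[j - 1]) + Fraction(c, r)   # uses s_{j-2}
    return P

def frac_value(digits, r, first_exp):
    """Value of sum(d_i * r**-(first_exp + i)) over the digit list."""
    return sum(Fraction(d) / Fraction(r) ** (first_exp + i)
               for i, d in enumerate(digits))

rng = random.Random(0)
r, m = 4, 6
for _ in range(100):
    a = [rng.randint(-3, 3) for _ in range(m)]
    b = [rng.randint(-3, 3) for _ in range(m)]
    s = [rng.randint(-3, 3) for _ in range(m + 1)]   # arbitrary, not selected
    A, B, R = frac_value(a, r, 1), frac_value(b, r, 1), frac_value(s, r, 0)
    P_last = final_residual(a, b, s, r)
    assert R == (A + B) - Fraction(1, r ** m) * (P_last - s[m])
```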

By devising a sum digit selection procedure, SELECT, in Step 4 of
the Algorithm "ADD", such that:

$$|P_{m+1} - s_m| \le k \tag{D.12}$$

R=A+B can be computed to m digits of precision.


Selection  Rules 

Figure (D.1) shows a selection graph for the operation of addition.
This is a plot of shifted partial sum versus the sum of the operand
digits ($a_j$ and $b_j$) at step j. We designate this as a P-c plot.
The notations used in this graph are similar to those used in
Appendix C.

Figure D.1 - A P-c Plot

In order to derive the range restriction on $P_j$, our only
assumption will be the basic recursion formula (D.8) and the fact
that $P_j$ should be bounded.

$$P_j = r(P_{j-1} - s_{j-2}) + c_j r^{-1}, \qquad c_j = a_j + b_j \tag{D.13}$$


The maximum and the minimum values of $P_j$ are obtained as follows:

$$(P_j)_{\max}^{s_{j-2}=i} = U_i - ri + c_j r^{-1} \tag{D.14}$$

$$(P_j)_{\min}^{s_{j-2}=i} = L_i - ri + c_j r^{-1} \tag{D.15}$$

When $s_{j-2}=\rho$, from (D.14) we get:

$$(P_j)_{\max}^{s_{j-2}=\rho} = U_\rho - r\rho + c_j r^{-1}$$

Since $P_j$ should be bounded, this value should be equal to
$U_\rho / r$, therefore:

$$U_\rho - r\rho + c_j r^{-1} = \frac{U_\rho}{r}$$

or

$$U_\rho = r^2 k - \frac{c_j}{r-1} \tag{D.16}$$

Similar to this, for the lower bound we obtain:

$$(P_j)_{\min}^{s_{j-2}=-\rho} = L_{-\rho} + r\rho + c_j r^{-1}$$

This results in:

$$L_{-\rho} = -r^2 k - \frac{c_j}{r-1} \tag{D.17}$$

From (D.16) and (D.17), (D.18) will be obtained:

$$r^2 k - \frac{c_j}{r-1} \ge rP_j \ge -r^2 k - \frac{c_j}{r-1} \tag{D.18}$$

Thus, the upper and lower bounds of $rP_j$ are implied by the
recursion formula and not by the selection procedure (as shown in
[IRW 77]).

Selection regions are obtained using Equations (D.14), (D.15),
(D.16) and (D.17) as follows:

$$(P_j)_{\max}^{s_{j-2}=i} \le \frac{U_\rho}{r} \quad \text{for all } i\text{'s}$$

Plugging Equations (D.14) and (D.16) into the above inequality we
get:

$$U_i \le r(i+k) - \frac{c_j}{r-1} \tag{D.19}$$

Similarly from Equations (D.15) and (D.17) and the relation:

$$(P_j)_{\min}^{s_{j-2}=i} \ge \frac{L_{-\rho}}{r} \quad \text{for all } i\text{'s}$$

we get:

$$L_i \ge r(i-k) - \frac{c_j}{r-1} \tag{D.20}$$

In order to have maximum overlap between the adjacent regions of the
P-c plot, the equality signs of Equations (D.19) and (D.20) should be
satisfied. Therefore:

$$U_i = r(i+k) - \frac{c_j}{r-1} \tag{D.21}$$

$$L_i = r(i-k) - \frac{c_j}{r-1} \tag{D.22}$$

Thus, the selection regions are specified by the following
inequality:

$$r(i+k) - \frac{c_j}{r-1} \ge rP_j \ge r(i-k) - \frac{c_j}{r-1} \tag{D.23}$$


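In order to have overlaps between the adjacent regions of the P-c
plot, $U_i$ should be greater than $L_{i+1}$ for all i's. That is:

$$U_i \ge L_{i+1}$$

Substituting the region bounds, this reduces to $2k \ge 1$, which
holds because $\rho \ge r/2$ gives $k = \rho/(r-1) > 1/2$. Therefore,
there is always overlap between the adjacent regions of the P-c plot.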
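
The overlap claim can be checked directly. Taking the region bounds as $U_i = r(i+k) - c_j/(r-1)$ and $L_i = r(i-k) - c_j/(r-1)$, the overlap between regions i and i+1 is $U_i - L_{i+1} = r(2k-1)$, independent of i and $c_j$. The sketch below confirms this for one illustrative parameter choice (r = 4, ρ = 3, operand digits in {-3, ..., 3}); the function name `pc_regions` is hypothetical.

```python
from fractions import Fraction

def pc_regions(r, rho, c):
    """Return {i: (L_i, U_i)} for the shifted partial sum r*P_j."""
    k = Fraction(rho, r - 1)
    shift = Fraction(c, r - 1)
    return {i: (r * (i - k) - shift, r * (i + k) - shift)
            for i in range(-rho, rho + 1)}

r, rho = 4, 3            # illustrative radix and digit bound
for c in range(-6, 7):   # every possible c_j = a_j + b_j for digits in {-3..3}
    regions = pc_regions(r, rho, c)
    for i in range(-rho, rho):
        overlap = regions[i][1] - regions[i + 1][0]
        assert overlap == r * (2 * Fraction(rho, r - 1) - 1)
        assert overlap >= 0
```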


In order to prove that the relation (D.12) is satisfied by the given
selection procedure, we rewrite Eq. (D.23) as:

$$r(s_{j-1}+k) - \frac{c_j}{r-1} \ge rP_j \ge r(s_{j-1}-k) - \frac{c_j}{r-1}$$

or

$$k - \frac{c_j}{r(r-1)} \ge P_j - s_{j-1} \ge -k - \frac{c_j}{r(r-1)}$$

When j=m+1 we obtain:

$$k - \frac{c_{m+1}}{r(r-1)} \ge P_{m+1} - s_m \ge -k - \frac{c_{m+1}}{r(r-1)}$$

Since $a_{m+1}=b_{m+1}=0$, $c_{m+1}=0$ and therefore:

$$|P_{m+1} - s_m| \le k$$

So condition (D.12) is satisfied by the given selection procedure and
R is indeed the sum of A and B to m digits of precision.


Subtraction 

Since the subtrahend is represented in redundant format, subtraction
can be performed by just flipping the sign of the subtrahend digits
and following the addition procedure given above.
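
The sign-flipping step works because a redundant (signed-digit) set is symmetric about zero, so negating every digit yields another valid digit string whose value is the negative of the original. A minimal Python check, with the helper names `sd_value` and `negate` introduced only for this example:

```python
from fractions import Fraction

def sd_value(digits, r):
    """Value of a signed-digit fraction sum(d_i * r**-i), i = 1..m."""
    return sum(Fraction(d, r ** i) for i, d in enumerate(digits, start=1))

def negate(digits):
    """Flip the sign of every digit; the symmetric digit set keeps this valid."""
    return [-d for d in digits]

b = [1, 0, -1, 1]                       # radix-2 signed-digit subtrahend
assert sd_value(negate(b), 2) == -sd_value(b, 2)
```

No carry or borrow propagation is involved, so the negation can be applied digit by digit as the subtrahend arrives.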